This section is a follow-up to the pronunciation side of the Hiragana lessons, covering a couple of additional things you need to know in order to pronounce Japanese words correctly.
Rhythm and Intonation
Japanese is said to be a mora-timed language, meaning that each mora takes up roughly the same amount of time. This gives Japanese a kind of rapid-fire rhythm that (I think) is fairly easy to learn. English is at the opposite end of the spectrum – it's not a syllable-timed language, but a stress-timed language, where stressed syllables are roughly equally spaced apart and syllable length in between is variable.
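If it helps to see the unit of timing concretely, here is a minimal sketch of counting moras in a word written in kana. It assumes the usual counting convention (every kana is one mora, except that the small ゃ, ゅ, ょ and small vowels merge with the kana before them, while っ, ん and ー each count as a full mora) and it assumes the input contains only kana. The function name count_moras is made up for this illustration.

```python
# Small kana that combine with the preceding kana instead of forming their own mora.
# Note that the small っ is deliberately NOT in this set: it counts as a full mora.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def count_moras(kana: str) -> int:
    """Count moras in a kana-only string (assumption: no kanji, spaces, or punctuation)."""
    return sum(1 for ch in kana if ch not in SMALL_KANA)


print(count_moras("にほん"))      # ni-ho-n
print(count_moras("きょう"))      # kyo-u
print(count_moras("とうきょう"))  # to-u-kyo-u
```

Under this convention とうきょう (Tokyo) is four moras, so it takes about four beats, even though an English speaker tends to say it as two syllables.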
We need to tread very carefully with the next concept, that of accent. By this I don't mean the concept of "having an accent", but that of accents on particular syllables in words, or in the case of Japanese, particular moras.
English is said to use stress accent, where the stressed syllable is louder and longer than unstressed syllables. This is the difference between the noun "subject" and the verb "subject". Japanese, on the other hand, is said to have pitch accent. As usually taught, each mora of a word is either "low" or "high", that is, it has a relative pitch. Once you get to a "high" mora, each mora that follows is also high until the accented mora, after which the pitch drops to low for the rest of the word.
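The textbook description above can be sketched as a small function. This is only an illustration of the simplified low/high model, not a claim about how the pitch physically behaves. It assumes the usual Tokyo-dialect convention, which the description above doesn't spell out, that the first mora is low unless it is itself the accented mora, and that a word with no accent stays high to the end. The function name pitch_pattern and the convention of using 0 for "no accent" are made up for this example.

```python
def pitch_pattern(mora_count: int, accent: int = 0) -> str:
    """Return one 'L' or 'H' per mora under the simplified textbook model.

    accent is the 1-based position of the accented mora, or 0 for an
    unaccented word (hypothetical convention chosen for this sketch).
    """
    if mora_count < 1:
        raise ValueError("a word needs at least one mora")
    if not 0 <= accent <= mora_count:
        raise ValueError("accent must be 0 or a valid mora position")

    pattern = []
    for position in range(1, mora_count + 1):
        if accent == 1:
            # Accent on the first mora: high there, low for the rest of the word.
            pattern.append("H" if position == 1 else "L")
        elif position == 1:
            # Assumption: the first mora is low when it is not the accented one.
            pattern.append("L")
        elif accent == 0 or position <= accent:
            # High up to and including the accented mora (or to the end if unaccented).
            pattern.append("H")
        else:
            # After the accented mora the pitch drops and stays low.
            pattern.append("L")
    return "".join(pattern)


print(pitch_pattern(2, 1))  # HL
print(pitch_pattern(4, 2))  # LHLL
print(pitch_pattern(4, 0))  # LHHH
```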
Most textbooks and dictionaries don't specify pitch accent, but the good news is that while some words are differentiated only by pitch accent, context will usually clear up any ambiguity. The actual pitch changes are more gradual than implied above, and at the sentence level things get more complicated, as they do in any language. The real problem with the above description, however, is that physically it's not quite accurate, which makes it hard to replicate just from reading the standard explanation. Before we can discuss why, you need to know about another curious but pervasive aspect of Japanese speech.
In the standard and many other dialects of Japanese, the vowels 'i' and 'u' are often devoiced between two voiceless consonants or following a voiceless consonant at the end of a word. This means that the mouth still takes and holds the shape of the vowel for the duration of the mora, but the vowel isn't voiced. The vowel may sound "whispered", or even deleted, to foreign speakers. Other vowels can be devoiced, but this occurs much less frequently. On the other hand, fully pronouncing vowels that would normally be devoiced is sometimes heard in certain female speech, formal speech, and some western dialects.
Two cases of a devoiced 'u' that you will encounter very soon are the copula ("to be" word) です (desu), which sounds like "des", and the verb suffix ます (masu), which sounds like "mas".
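As a rough sketch, the devoicing rule described above can be applied mechanically to a word that has already been split into romanized moras. Several things here are my own simplifications: the helper names are made up, a consonant counts as voiceless if its romanization starts with k, s, t, c (for "ch"), h, f, or p, and the doubled consonant (small っ) is not handled. Since devoicing only happens "often", real speech will not always match the output.

```python
VOWELS = "aiueo"
# Assumption: an onset is voiceless if its romanization starts with one of these letters.
VOICELESS_INITIALS = set("kstchfp")


def split_mora(mora: str) -> tuple[str, str]:
    """Split a romanized mora into (onset, vowel); the vowel is '' for a bare 'n'."""
    if mora and mora[-1] in VOWELS:
        return mora[:-1], mora[-1]
    return mora, ""


def is_voiceless(onset: str) -> bool:
    return bool(onset) and onset[0] in VOICELESS_INITIALS


def mark_devoiced(moras: list[str]) -> list[str]:
    """Wrap the vowel in parentheses wherever the rule predicts devoicing."""
    result = []
    for index, mora in enumerate(moras):
        onset, vowel = split_mora(mora)
        devoice = False
        if vowel in ("i", "u") and is_voiceless(onset):
            if index == len(moras) - 1:
                # Word-final 'i' or 'u' after a voiceless consonant.
                devoice = True
            else:
                # 'i' or 'u' between two voiceless consonants.
                next_onset, _ = split_mora(moras[index + 1])
                devoice = is_voiceless(next_onset)
        result.append(f"{onset}({vowel})" if devoice else mora)
    return result


print(mark_devoiced(["de", "su"]))  # ['de', 's(u)']
print(mark_devoiced(["ma", "su"]))  # ['ma', 's(u)']
print(mark_devoiced(["ku", "sa"]))  # ['k(u)', 'sa']
```

The first two results correspond to "des" and "mas" above. In the third, the 'u' sits between 'k' and 's', so the rule predicts devoicing there as well.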
The Problem with Pitch Accent
The question now is this: if Japanese has devoiced vowels, then how can these moras have a "pitch"? I only became aware of this problem when I came across the article "Against Marking Accent Locations in Japanese Textbooks".
The basic idea of the article is that the physical frequency of Japanese speech doesn't line up with speakers' perception of pitch, making it impossible for nonnative speakers to derive an accurate pronunciation from the typical written description of pitch accent. Even more interesting, English "stress" accent is also found to be most significantly based on pitch, then duration, and finally loudness.
The moral of the story is that for beginners trying to speak Japanese, the most important thing to focus on is keeping each mora the same length and loudness. As for pitch, remember that you will have certain tendencies carried over from English, so it may also be useful to try to keep your pitch relatively even until you've had some practice imitating native speech.
Onwards to Speaking
At this point, you have all of the information you'll need to speak Japanese with reasonable accuracy. From this point forward all you really need is practice.