I was recently working on my NHK-Easy web crawler project, since the site got an update with a number of breaking changes. One of these changes concerned the audio of the articles.
Previously, each article had a corresponding static .mp3 resource. The update introduced a new HLS (HTTP Live Streaming) media player, accompanied by an M3U8 playlist. Furthermore, I observed that the audio was now split into several .ts file segments, which were loaded on demand.
I had never worked with these file types before, so off I went to Google. However, it wasn’t exactly easy to find answers to my questions, which is why I decided to summarize my findings here, in case other people have similar issues. Also, I wanted to have a go at writing my first post.
Given a master M3U8 playlist, the goal is to end up with a single mp3 file.
The first step when dealing with M3U8 is parsing the master playlist.
Once the master playlist is parsed, one can read its content, which consists of one or more media playlists.
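As a rough illustration of this step, here is a minimal hand-rolled sketch that pulls the media playlist URIs out of a master playlist. It is not the parser used in the project: the class name `MasterPlaylistSketch` and the method `mediaPlaylistUris` are made up for this example, and it assumes that each `#EXT-X-STREAM-INF` tag is followed by the URI of one media playlist on the next non-comment line.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

final class MasterPlaylistSketch {

    /**
     * Returns the media playlist URIs listed in a master playlist,
     * resolved against the URI the master playlist was loaded from.
     */
    static List<URI> mediaPlaylistUris(String masterPlaylist, URI baseUri) {
        List<URI> result = new ArrayList<>();
        boolean expectUri = false;
        for (String raw : masterPlaylist.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("#EXT-X-STREAM-INF")) {
                // The next non-comment line names the media playlist.
                expectUri = true;
            } else if (!line.startsWith("#") && expectUri) {
                result.add(baseUri.resolve(line));
                expectUri = false;
            }
        }
        return result;
    }
}
```

Resolving against the base URI matters because playlists often contain relative paths.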
A typical media playlist could look like the following.
It contains a URL for each segment that the stream is made up of. Additionally, there are a number of meta tags, of which one is particularly important.
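To make the structure concrete, the sketch below embeds a made-up media playlist (the URLs, durations and IV are invented, not taken from the actual site) and extracts the segment URIs as well as the attributes of the `#EXT-X-KEY` tag, which is the tag that describes the encryption. The class and method names are hypothetical, and the attribute parsing is deliberately simplistic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MediaPlaylistSketch {

    // Invented example playlist, for illustration only.
    static final String SAMPLE = String.join("\n",
            "#EXTM3U",
            "#EXT-X-VERSION:3",
            "#EXT-X-TARGETDURATION:10",
            "#EXT-X-MEDIA-SEQUENCE:0",
            "#EXT-X-KEY:METHOD=AES-128,URI=\"https://example.com/key\","
                    + "IV=0x00000000000000000000000000000001",
            "#EXTINF:10.0,",
            "segment0.ts",
            "#EXTINF:8.5,",
            "segment1.ts",
            "#EXT-X-ENDLIST");

    // NAME=value pairs; a value is either quoted or runs up to the next comma.
    private static final Pattern ATTRIBUTE =
            Pattern.compile("([A-Z0-9-]+)=(\"[^\"]*\"|[^,]*)");

    /** Every non-empty line that is not a tag or comment is a segment URI. */
    static List<String> segmentUris(String playlist) {
        List<String> uris = new ArrayList<>();
        for (String raw : playlist.split("\\R")) {
            String line = raw.trim();
            if (!line.isEmpty() && !line.startsWith("#")) {
                uris.add(line);
            }
        }
        return uris;
    }

    /** Attributes of the first #EXT-X-KEY tag, or an empty map if there is none. */
    static Map<String, String> keyAttributes(String playlist) {
        String tag = "#EXT-X-KEY:";
        Map<String, String> attributes = new LinkedHashMap<>();
        for (String raw : playlist.split("\\R")) {
            String line = raw.trim();
            if (!line.startsWith(tag)) {
                continue;
            }
            Matcher m = ATTRIBUTE.matcher(line.substring(tag.length()));
            while (m.find()) {
                String value = m.group(2);
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1); // strip quotes
                }
                attributes.put(m.group(1), value);
            }
            break; // sketch: only the first key tag is considered
        }
        return attributes;
    }
}
```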
What took me the longest to figure out was that the segments were encrypted. According to the specification, every packet of an MPEG-TS file (.ts) starts with a sync byte, 0x47, which is the ASCII char G. What confused me was that there was no G, since the files were encrypted. Lesson learned: go read the specification.
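A quick way to tell whether a downloaded segment is plain or still encrypted is to look for that marker. The sketch below checks the first byte of every complete 188-byte packet (the standard MPEG-TS packet size); the class and method names are made up for this example.

```java
final class TsSyncCheck {

    static final int PACKET_SIZE = 188;   // standard MPEG-TS packet length
    static final byte SYNC_BYTE = 0x47;   // ASCII 'G'

    /**
     * Returns true if every complete packet in the data starts with the
     * sync byte. Trailing bytes that do not fill a packet are ignored.
     */
    static boolean looksLikeTransportStream(byte[] data) {
        if (data.length < PACKET_SIZE) {
            return false;
        }
        for (int offset = 0; offset + PACKET_SIZE <= data.length; offset += PACKET_SIZE) {
            if (data[offset] != SYNC_BYTE) {
                return false;
            }
        }
        return true;
    }
}
```

If this returns false for a freshly downloaded segment, encryption is a likely explanation, and the same check is handy for verifying the output after decryption.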
Here is how I handle the encryption.
EncryptionData can be obtained from any track of the media playlist.
Critical for the success of decryption were the chainmode and the padding.
- Without any of those, e.g. just AES, the decryption would throw.
- With NoPadding, but a different chainmode, decryption would succeed, but the files were corrupted.
I simply used trial and error until I found the right combination.
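For reference, a minimal decryption sketch using the standard `javax.crypto` API could look like this. The transformation `AES/CBC/NoPadding` is an assumption: NoPadding matches what is described above, and CBC is the chain mode that the HLS specification (RFC 8216) prescribes for the AES-128 method. The class name `SegmentDecryptor` is made up, and the key and IV are assumed to be already available as 16-byte arrays (the key is downloaded from the URI in the key tag).

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class SegmentDecryptor {

    /**
     * Decrypts one segment. Assumes AES-128 in CBC mode without padding
     * removal; key and iv must both be 16 bytes long.
     */
    static byte[] decrypt(byte[] encrypted, byte[] key, byte[] iv)
            throws GeneralSecurityException {
        // Naming chain mode and padding explicitly avoids provider defaults.
        Cipher cipher = Cipher.getInstance("AES/CBC/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE,
                new SecretKeySpec(key, "AES"),
                new IvParameterSpec(iv));
        return cipher.doFinal(encrypted);
    }
}
```

Note that RFC 8216 specifies PKCS7 padding, so with NoPadding the padding bytes stay at the end of the decrypted data. `AES/CBC/PKCS5Padding` would strip them and may be worth trying as well.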
MPEG-TS is not the most common media format and is therefore more difficult to work with. Instead of .ts files, it would be nice if we could just work with the audio as wav files. I tried to find a way to do this in Java, but ended up using FFmpeg because I could not find a suitable library for the job. There is JCodec, but the API is awkward and it doesn’t seem like the required format is supported. FFmpeg is pretty easy to use via the CLI and supports a wide range of formats.
To demux a .ts segment to a .wav file, simply use the below code, where the target is a .wav File.
Converting wav to mp3 is similarly easy:
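Both conversions can be sketched with a single helper that shells out to FFmpeg, since FFmpeg infers the output format from the target file extension. This assumes an `ffmpeg` binary is on the PATH and, for the mp3 step, that it was built with an mp3 encoder. The class `FfmpegSketch` and its methods are made up for this example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

final class FfmpegSketch {

    /** Builds the command line: -y overwrites the target, -i names the input. */
    static List<String> command(Path source, Path target) {
        return List.of("ffmpeg", "-y", "-i", source.toString(), target.toString());
    }

    /** Runs ffmpeg and fails if it exits with a non-zero code. */
    static void convert(Path source, Path target)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(source, target))
                .inheritIO() // show ffmpeg's own log output in the console
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("ffmpeg exited with code " + exitCode);
        }
    }
}
```

With this, `convert(Paths.get("segment0.ts"), Paths.get("segment0.wav"))` would demux a segment and `convert(Paths.get("article.wav"), Paths.get("article.mp3"))` would produce the mp3; the file names here are placeholders.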
Last step: merging the individual segments into one file.
I use the following to merge audio streams.
AudioInputStreams are obtained from files using AudioSystem.getAudioInputStream and writing to file is done with AudioSystem.write.
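One possible way to do the merge with the standard `javax.sound.sampled` API is to chain the streams with a `SequenceInputStream` and wrap the result in a new `AudioInputStream` whose length is the sum of the parts. This is a sketch, not necessarily the approach used in the project: the class `WavMerger` is made up, and it assumes that all segments share the same audio format and report a known frame length, which is normally the case for wav files.

```java
import java.io.File;
import java.io.IOException;
import java.io.SequenceInputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

final class WavMerger {

    /** Concatenates streams that all have the same format, in list order. */
    static AudioInputStream concat(List<AudioInputStream> parts) {
        AudioFormat format = parts.get(0).getFormat();
        long totalFrames = 0;
        for (AudioInputStream part : parts) {
            if (!part.getFormat().matches(format)) {
                throw new IllegalArgumentException("All parts must share one audio format");
            }
            totalFrames += part.getFrameLength();
        }
        // SequenceInputStream reads each part to its end, then moves on to the next.
        return new AudioInputStream(
                new SequenceInputStream(Collections.enumeration(parts)),
                format,
                totalFrames);
    }

    /** Reads the given wav files and writes them to one wav file. */
    static void mergeWavFiles(List<File> inputs, File output)
            throws IOException, UnsupportedAudioFileException {
        List<AudioInputStream> parts = new ArrayList<>();
        for (File input : inputs) {
            parts.add(AudioSystem.getAudioInputStream(input));
        }
        try (AudioInputStream merged = concat(parts)) {
            AudioSystem.write(merged, AudioFileFormat.Type.WAVE, output);
        }
    }
}
```

The segments have to be passed in playlist order, since the merged audio simply plays them back to back.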
And that’s basically how I went from M3U8 playlists and encrypted ts segments to a single mp3.