To help you further along: I found that if I muxed the audio and video as whole files, I occasionally had audio sync problems. To resolve that, I muxed them chunk by chunk instead. So my decryption process was something like this (pseudocode):
Code:
create output.ts file
for each segment:
    read segment from video
    read segment from audio
    decrypt video
    write video segment to temporary.ts file
    write audio segment to temporary.aac file
    ffmpeg -i temporary.ts -i temporary.aac -acodec copy -vcodec copy muxed_temporary.ts
    append muxed_temporary.ts to output.ts
// then convert to mp4
ffmpeg -i output.ts -acodec copy -vcodec copy -bsf:a aac_adtstoasc output.mp4
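Some players (Windows Media Player, IIRC) won't play the audio without the "-bsf:a aac_adtstoasc" flag. That filter rewrites the ADTS-style AAC headers used in the .ts/.aac files into the form MP4 expects.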
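If it helps, here is a rough Python sketch of the loop above. It is only an illustration, not the exact code I ran: the `segments` iterable (pairs of raw video and audio bytes) and the `decrypt` callable are hypothetical placeholders you would supply yourself, the function names are made up, and I added ffmpeg's `-y` flag so the temporary muxed file can be overwritten on every pass. Everything else mirrors the two ffmpeg commands shown above.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def mux_command(video_ts: str, audio_aac: str, muxed_ts: str) -> List[str]:
    """Build the per-chunk ffmpeg call: stream-copy video + audio into one TS file."""
    # -y (overwrite without asking) is an addition; the temp output is reused each pass.
    return ["ffmpeg", "-y", "-i", video_ts, "-i", audio_aac,
            "-acodec", "copy", "-vcodec", "copy", muxed_ts]


def remux_to_mp4_command(ts_path: str, mp4_path: str) -> List[str]:
    """Build the final ffmpeg call that rewraps output.ts as MP4."""
    return ["ffmpeg", "-y", "-i", ts_path,
            "-acodec", "copy", "-vcodec", "copy",
            "-bsf:a", "aac_adtstoasc", mp4_path]


def mux_in_chunks(
    segments: Iterable[Tuple[bytes, bytes]],   # hypothetical: (video_bytes, audio_bytes)
    decrypt: Callable[[bytes], bytes],         # hypothetical: your own decryption routine
    workdir: str,
    run: Callable[..., object] = subprocess.run,
) -> Path:
    """Mux each segment pair separately and append the result to output.ts."""
    work = Path(workdir)
    output_ts = work / "output.ts"
    tmp_video = work / "temporary.ts"
    tmp_audio = work / "temporary.aac"
    tmp_muxed = work / "muxed_temporary.ts"

    with open(output_ts, "wb") as out:
        for video_bytes, audio_bytes in segments:
            # As in the pseudocode, only the video segment is decrypted here.
            tmp_video.write_bytes(decrypt(video_bytes))
            tmp_audio.write_bytes(audio_bytes)
            run(mux_command(str(tmp_video), str(tmp_audio), str(tmp_muxed)), check=True)
            # TS files can be joined by plain byte concatenation.
            with open(tmp_muxed, "rb") as muxed:
                shutil.copyfileobj(muxed, out)
    return output_ts
```

After the loop, something like `subprocess.run(remux_to_mp4_command("output.ts", "output.mp4"), check=True)` does the final MP4 conversion. The `run` parameter is only there so you can swap in a stub and check the flow without ffmpeg installed.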
You are getting close...
