Xvid is a very good codec, but it is not suited to real-time encoding because it needs a lot of CPU resources.
I would suggest recording to MPEG-2 instead. You can then recompress the MPEG-2 files with Xvid or x264.
If you need better quality, you can record from the screen using the Huffyuv or Motion JPEG codec instead of MPEG-2.
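The recompression step could be scripted roughly as follows. This is only a sketch and it assumes things not stated in this thread: that you use the ffmpeg command-line tool, that your ffmpeg build includes the libx264 and libxvid encoders, and that the file names and quality values shown are placeholders to adjust. The helper name build_recompress_cmd is made up for this example.

```python
import shlex


def build_recompress_cmd(src, dst, codec="x264", quality=None):
    """Build an ffmpeg argument list that re-encodes the video in `src` to `dst`.

    Assumption: ffmpeg is installed with libx264 and libxvid enabled.
    The default quality values are illustrative starting points only.
    """
    if codec == "x264":
        # CRF rate control: lower values give higher quality and bigger files.
        crf = 20 if quality is None else quality
        video = ["-c:v", "libx264", "-preset", "slow", "-crf", str(crf)]
    elif codec == "xvid":
        # Fixed quantizer: lower values give higher quality and bigger files.
        q = 4 if quality is None else quality
        video = ["-c:v", "libxvid", "-q:v", str(q)]
    else:
        raise ValueError("codec must be 'x264' or 'xvid'")
    # The audio stream is copied unchanged; this assumes the output
    # container accepts the audio format of the capture.
    return ["ffmpeg", "-i", src, *video, "-c:a", "copy", dst]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be checked first.
    print(shlex.join(build_recompress_cmd("capture.mpg", "capture.mkv")))
    print(shlex.join(build_recompress_cmd("capture.mpg", "capture.avi", codec="xvid")))
```

To actually run a command, pass the returned list to subprocess.run(cmd, check=True). Since this step is done offline after the capture, a slow preset costs encoding time only and does not affect the recording.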
As for the WM Capture audio problems, see the following thread:
Poor audio sync with WM Capture