Audio/video stream recording forums (http://stream-recorder.com/forum/index.php)

- Removing DRM protection from eBooks (http://stream-recorder.com/forum/forumdisplay.php?f=63)

How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindle)

(http://stream-recorder.com/forum/showthread.php?t=5426)

any ANONYMOUS forum user

01-18-2010 04:07 AM

How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindle)

Quote:

Topaz is an Amazon format for Kindle devices. It differs from the AZW format in that it can have embedded fonts in the file itself. A .tan sidefile is used to store metadata and bookmarks and other user generated content on the eBook. The metadata is used to help the library mode to reference information about the eBook itself.

While not much is currently known about the internal format used in a Topaz file there is some likelihood that it is related to the standard AZW format. It uses a different compression than standard MOBI files and it can have embedded fonts in the file allowing more complex display using font sets and characters that are not standard to Amazon Kindle. It is also likely to remove other restrictions found in MOBI files such as image size limitations although some of these may have been removed in AZW as well.

According to one publishing industry blogger, Topaz is an implementation of the open EPUB standard. It follows the OEBPS 2.0 specs, and probably the later IDPF guides. It’s a proprietary implementation which means they use ePUB as the source but then convert it to their internal format.

AZW1 - is an eBook in the Topaz (TPZ) format that has been delivered via Whispernet.

TPZ - is an eBook in the Topaz format that has been delivered via Internet download.

The following is experimental and it will probably not work for you but…

ALSO: Please do not use any of this to steal. Theft is wrong.

This is only meant to allow conversion of Topaz books for other book readers you own.

Here are the steps:

1. First you must use the Python scripts in topazscripts.zip to do the translation from Topaz to HTML.

The files you should have after unzipping are:

cmbtc_dump.py – (author: cmbtc) decrypts and dumps to files all of the sections, properly numbered and named

decode_meta.py – converts metadata0000.dat to human readable text

convert2xml.py – converts page*.dat, other*.dat, and glyphs*.dat files to their “pseudo” xml descriptions.

flatxml2html.py – converts a “flattened” xml description to html using the ocrtext and markup as its basis.

stylexml2css.py – converts stylesheet “flattened” xml from other0000.dat into css (as best it can – mainly supporting paragraph style classes)

genxml.py – main program to convert everything to xml

genhtml.py – main program to generate “book.html”

2. You must remove the DRM from the Topaz book and build a directory of its contents using the following command:

cmbtc_dump.py -d -o TARGETDIR [-p pid] YOURTOPAZBOOKNAMEHERE

This should create a directory called “TARGETDIR” in your current directory.

It should have the following files in it:

metadata0000.dat – metadata info
other0000.dat – information used to create a style sheet
dict0000.dat – dictionary of words used to build page descriptions
page – directory filled with page*.dat files
glyphs – directory filled with glyphs*.dat files

3. You should convert the files in “TARGETDIR” to their xml descriptions.
Please note, this python program uses “decode_meta.py” and “convert2xml.py” so don’t move them.

genxml.py TARGETDIR

4. Next attempt a conversion to html where “TARGETDIR” is the directory that was created in step 2.
Please note, this python program uses “decode_meta.py”, “convert2xml.py”, “flatxml2html.py”, and “stylexml2css.py” so don’t move them.

genhtml.py TARGETDIR

Once it completes:

You should have created the file “book.html” inside of TARGETDIR

You should also have created the directory xml inside of TARGETDIR
which has the full xml descriptions of the pages and glyphs for later
(better) conversion attempts.

One warning … this is not the best long-term solution because much of the layout is only really correct if drawn to the screen (as an svg). Until that solution exists, this should get you something that you can load into Sigil and clean up and make an ePub that you can then convert to other formats
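Several later errors in this thread ("Can not find dict0000.dat file") come down to TARGETDIR not containing what the conversion scripts expect. A minimal sketch for checking the layout listed above before running genxml.py or genhtml.py follows. The file and directory names are taken from the post; the helper `missing_entries` is hypothetical and is not part of topazscripts.zip.

```python
import os

# Names copied from the file list in the post; everything else here is illustrative.
EXPECTED_FILES = ("metadata0000.dat", "other0000.dat", "dict0000.dat")
EXPECTED_DIRS = ("page", "glyphs")


def missing_entries(target_dir):
    """Return the expected files and directories that are absent from target_dir."""
    missing = [
        name for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(target_dir, name))
    ]
    # Directories are reported with a trailing slash so they are easy to tell apart.
    missing += [
        name + "/" for name in EXPECTED_DIRS
        if not os.path.isdir(os.path.join(target_dir, name))
    ]
    return missing
```

Calling `missing_entries("TARGETDIR")` from the directory that holds TARGETDIR returns an empty list when everything is in place, and otherwise names what is missing.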

Code:

http://www.pastie.org/760591

http://www.mediafire.com/?qmzjmt25yzf

http://rapidshare.com/files/336800633/topazscripts.zip.html

elch

03-21-2010 02:55 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Code:

http://pastie.org/761169.txt

seems to be more up-to-date.

vinografia

03-27-2010 02:40 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

How do you use "http://pastie.org/761169.txt"? (I've tried using unswindle but it doesn't work on topaz.)

TIA

elch

03-28-2010 09:29 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

I don't have a Kindle myself (or whichever device is necessary for Topaz e-books) but I'm a bit interested in encryption-related topics.

You need Python for this script. I'm not a Windows user anymore but there are precompiled binaries which should work fine.

Download the script, start the command line and type: "python script.py filename"

It accepts the following parameters:

Quote:

print("\nCMBDTC.py [options] bookFileName\n")
print("-p Adds a PID to the list of PIDs that are tried to decrypt the book key (can be used several times)")
print("-r Prints or writes to disk a record indicated in the form name:index (e.g \"img:0\")")
print("-o Output file name to write records")
print("-v Verbose (can be used several times)")
print("-i Print kindle.info database")

Stream Recorder

03-31-2010 01:18 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

The following set of tools can also be used to remove DRM from Amazon Topaz eBooks:

TopazExtract_Kindle_iPhone.pyw,
TopazFiles2XML.pyw,
TopazFiles2SVG.pyw,
TopazFiles2HTML.pyw

tools_v1.6b.zip.

Code:

http://www.mediafire.com/?mn3vmttbwrt

The scripts should work with Kindle and iPhone Amazon Topaz Files (.tpz, .azw1). The files are really images of pages with OCR performed on them. Using the tools you can get SVG images of the pages, and the OCRed HTML version for clean-up.

maggerbee

04-03-2010 10:36 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

WOW, thanks so much for that download! I successfully converted my purchased topaz from Amazon, into an HTML, and then used Calibre to convert it to .epub for use on my HTC Hero (Android Phone) using Aldiko book reader. Thanks sooo much!!!

slopsbox

04-04-2010 10:06 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

I'm still unable to convert the files (xhtml) that I have, though the ebook itself has been stripped of DRM. I've tried merging the files with Adobe Acrobat Pro and Calibre, without success.

Can someone post the steps to do so? Thanks so much.

teebee

04-07-2010 08:30 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Is there anyone able to help me with an error message? I have successfully converted 3 books but am having trouble with the 4th. It strips the DRM, but when I go to convert it to XML I get the following error at page 256: "Error - -1501 outside of string table limits". I did some unsuccessful googling, so if anyone can help me I would appreciate it.

jcklaus

04-11-2010 03:44 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Running this I keep getting the error "Can not find dict0000.dat file" What am I doing wrong? Thanks.

any ANONYMOUS forum user

04-12-2010 04:12 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

How to remove DRM from Topaz ebooks:

Install Python
Open command prompt / terminal and run:

Code:

python cmbtc_dump_nonK4PC.py -d -o TARGETDIR -p 12345678 YOURTOPAZBOOKNAMEHERE

where
- 12345678 - the first 8 characters of your PID
- "TARGETDIR" - target directory (can be omitted)
- YOURTOPAZBOOKNAMEHERE - filename of your Topaz ebook (with the .tpz extension)

Then, again in the command prompt / terminal, run:

Code:

python gensvg.py TARGETDIR

Then create an HTML file from the SVG file by running the following in the command line / terminal:

Code:

python genhtml.py TARGETDIR

You should get a "book.html" file in the TARGETDIR directory.
Convert "book.html" to any other format using Calibre.

any ANONYMOUS forum user

04-12-2010 04:12 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by jcklaus (Post 18075)

Running this I keep getting the error "Can not find dict0000.dat file" What am I doing wrong?

Running what? On what OS? On what files?

jcklaus

04-12-2010 04:35 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Running Mac OS X Terminal and using the command line:

python TopazFiles2HTML.pyw MYTOPAZBOOKNAME

Stream Recorder

04-13-2010 12:43 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by jcklaus (Post 18113)

Running Mac OS X Terminal and using the command line:

python TopazFiles2HTML.pyw MYTOPAZBOOKNAME

The files in the lib directory are used by the script. Make sure to extract the lib directory with the other scripts.

From my understanding, you need to run
1. TopazExtract_Kindle_iPhone.pyw
2. then run TopazFiles2XML.pyw,
3. and then run either TopazFiles2SVG.pyw or TopazFiles2HTML.pyw
Maybe I'm wrong. I don't have a Kindle to check it out.

You can also try to run cmbdtc.py on your Topaz ebook

Code:

cmbtc_dump.py -d -o TARGETDIR [-p pid] YOURTOPAZBOOKNAMEHERE

and see whether you get any errors.

Maradona10

04-30-2010 12:14 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by Stream Recorder (Post 17767)

The following set of tools can also be used to remove DRM from Amazon Topaz eBooks:

TopazExtract_Kindle_iPhone.pyw,
TopazFiles2XML.pyw,
TopazFiles2SVG.pyw,
TopazFiles2HTML.pyw

tools_v1.6b.zip.

Code:

http://www.mediafire.com/?mn3vmttbwrt

I have downloaded these scripts. But can you please explain how to execute them properly and in which order? Cause I'm new to python. I have a topaz book on my iphone and want to convert it, but don't know how.

Any help appreciated.

jcklaus

05-01-2010 07:03 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by Stream Recorder (Post 18120)

Code:

cmbtc_dump.py -d -o TARGETDIR [-p pid] YOURTOPAZBOOKNAMEHERE

and see whether you get any errors.

Whenever I try the first step of running TopazExtract_Kindle_Iphone.pyw I get the following error message:

File "./lib/cmbtc_dump_nonK4PC.py", line 517, in <module>
sys.exit(main())
File "./lib/cmbtc_dump_nonK4PC.py", line 478, in main
bookFile = openBook(args[0])
File "./lib/cmbtc_dump_nonK4PC.py", line 57, in openBook
raise CMBDTCFatal("Could not open book file: " + path)
__main__.CMBDTCFatal: Could not open book file:

Stream Recorder

05-02-2010 02:31 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by jcklaus (Post 18666)

Are you trying to remove DRM from Topaz book? Or mobipocket book?

jcklaus

05-02-2010 08:20 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

I'm trying to remove DRM from my Topaz azw1 files to eventually convert to epub

yankgirl013

06-01-2010 08:13 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Hi, I'm hoping someone can help me here. I'm trying to remove a DRM off a topaz file and I'm not getting anywhere. I'm using a Mac OS.
I've downloaded all the files listed, but I keep getting a 'Can not find dict0000.dat file' error.

Is there a way we can 'dumb' the directions down? I've removed them from azw and mobi using terminal and python scripts.

Thanks so much!!!

It really doesn't matter what I convert it to, I can just change it to epub using calibre

any ANONYMOUS forum user

06-01-2010 12:30 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by yankgirl013 (Post 19314)

I keep getting a 'Can not find dict0000.dat file' error.

Not sure whether it will help

Quote:

Originally Posted by some updates

That means it can’t find your dict0000.dat file which should be right where you are.

This can not be the true error since other pages worked.

Please make sure that all of these are in the same location (i.e. side by side inside of TARGETDIR):

convert2xml.py
dict0000.dat
pageNNNN.dat

where NNNN is the number of the problem page*.dat file

Then make sure you have changed directory (cd) into TARGETDIR and then run

convert2xml.py -d dict0000.dat pageNNNN.dat > debug.txt

where again the NNNN is the number of the page file that does not work.

Then look in debug.txt for “Unknown” or any other warning or error message and post at http://darkreverser.wordpress.com/2008/02/13/new-blog/ what it says around that point in the debug.txt file.
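Reading through a large debug.txt by eye is tedious, so here is a minimal sketch of the search described above. It assumes debug.txt is plain text; the keyword list beyond “Unknown” and the helper name `find_suspect_lines` are assumptions for illustration, not something the scripts provide.

```python
def find_suspect_lines(lines, keywords=("Unknown", "Error", "Warning")):
    """Return (line_number, text) pairs for lines that mention any keyword."""
    lowered = [k.lower() for k in keywords]
    hits = []
    for number, line in enumerate(lines, 1):
        text = line.rstrip("\n")
        # Case-insensitive match so "unknown" and "UNKNOWN" are caught as well.
        if any(k in text.lower() for k in lowered):
            hits.append((number, text))
    return hits
```

Passing an open file works because file objects iterate line by line, for example `find_suspect_lines(open("debug.txt"))`. The line numbers make it easier to quote what appears around the problem when posting.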

djpyle

06-09-2010 07:24 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by jcklaus (Post 18666)

I'm getting this same error trying to remove DRM from a .tpz file. Any ideas? I'm running OSX 10.6.3.

If I try to run it without the PID, I get:

Traceback (most recent call last):
File "./lib/cmbtc_dump.py", line 37, in <module>
from ctypes import windll, c_char_p, c_wchar_p, c_uint, POINTER, byref, \
ImportError: cannot import name windll

Error: File Extraction Failed
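A note on the ImportError above: as far as I know, `ctypes.windll` is only defined when Python runs on Windows, which would explain why a script that imports it fails on OS X. A minimal sketch to confirm this on your own machine (the function names are made up for illustration):

```python
import ctypes
import os


def has_windll():
    """Return True when ctypes exposes the Windows-only windll loader."""
    return hasattr(ctypes, "windll")


def describe_platform():
    """Summarize whether 'from ctypes import windll' can work here."""
    if has_windll():
        return "os.name=%s: ctypes.windll is available" % os.name
    return "os.name=%s: ctypes.windll is missing, so importing it raises ImportError" % os.name
```

On OS X or Linux, `describe_platform()` should report that windll is missing, which matches the traceback.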

djpyle

06-09-2010 08:45 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Never mind, I got it to work using the command line.

Now my next question: is it possible to generate svg pages where the page takes up the entire screen instead of having the back, forward, and zoom in/out options? If so, you could load the xhtml pages in eCub and generate an ePub that would work perfectly in iBooks. Even with the back/forward in/out aspects, it still works well on iBooks, but those things are just taking up unnecessary space in this particular instance.

I know enough to open gensvg.py in an editor, I just don't know what to take out and what to leave in. Anyone have any tips?

djpyle

06-09-2010 08:51 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by yankgirl013 (Post 19314)

You need to run cmbtc_dump.py or cmbtc_dump_nonK4PC.py first. The former if you purchased the book for Kindle for PC, the latter if you purchased it for a Kindle or iDevice. That removes the DRM. Then you can convert it using the other scripts.

jcklaus

06-12-2010 06:00 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by djpyle (Post 19486)

Could you run through how you used the command line to solve the error I was getting as well? I don't understand. Thanks.

djpyle

06-12-2010 06:27 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by jcklaus (Post 19564)

Could you run through how you used the command line to solve the error I was getting as well? I don't understand. Thanks.

First instead of using the .pyw program, download the regular .py scripts discussed in the original post. I never could get the .pyw to work. You'll have to use Terminal to run the scripts. Once in Terminal, navigate to the folder containing your scripts and type:

Code:

python cmbtc_dump_nonK4PC.py -d -o TARGETDIR -p 12345678 YOURTOPAZBOOKNAMEHERE

where 12345678 should be replaced by the first 8 characters of your PID. You can use something else for "TARGETDIR" if you want, but you don't have to. The only things you have to change are the PID number and the filename. Use the full file name, including the extension (eg Patient-Zero-A-Joe-Ledger-Novel.tpz).

This will create a folder called TARGETDIR (or whatever you replaced that with) in the folder with your script.
Then, still in Terminal, type:

Code:

python gensvg.py TARGETDIR

This is where I stop because the .html the next step creates is very sloppy and doesn't retain italics, but if you want to create an .html, type:

Code:

python genhtml.py TARGETDIR

This creates a "book.html" file in the TARGETDIR folder, which you can then convert to other formats as you see fit.

Hope that helps. Let me know if you have any problems. I'll try to help as best I can.

jcklaus

06-13-2010 12:51 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by djpyle (Post 19568)

First instead of using the .pyw program, download the regular .py scripts discussed in the original post. I never could get the .pyw to work. You'll have to use Terminal to run the scripts. Once in Terminal, navigate to the folder containing your scripts and type:

Code:

python cmbtc_dump_nonK4PC.py -d -o TARGETDIR -p 12345678 YOURTOPAZBOOKNAMEHERE

where 12345678 should be replaced by the first 8 characters of your PID. You can use something else for "TARGETDIR" if you want, but you don't have to. The only things you have to change are the PID number and the filename. Use the full file name, including the extension (eg Patient-Zero-A-Joe-Ledger-Novel.tpz).

This will create a folder called TARGETDIR (or whatever you replaced that with) in the folder with your script.
Then, still in Terminal, type:

Code:

python gensvg.py TARGETDIR

This is where I stop because the .html the next step creates is very sloppy and doesn't retain italics, but if you want to create an .html, type:

Code:

python genhtml.py TARGETDIR

This creates a "book.html" file in the TARGETDIR folder, which you can then convert to other formats as you see fit.

Hope that helps. Let me know if you have any problems. I'll try to help as best I can.

THANK YOU! It worked! This had been driving me crazy and you completely spooked it for me.

jcklaus

06-13-2010 12:53 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by yankgirl013 (Post 19314)

Another user solved this for me. You need to do one additional conversion step in the Terminal using python to turn it into an html file. Type:

python genhtml.py TARGETDIR

Where TARGETDIR is the name of your folder. You'll get an html file called book.html which you can use in calibre

djpyle

06-13-2010 04:40 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by jcklaus (Post 19578)

THANK YOU! It worked! This had been driving me crazy and you completely spooked it for me.

Glad I could help.

jtchris

06-27-2010 01:20 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

I don't suppose somebody could PM me with the latest share where I could find the scripts, eh?

maggerbee

06-28-2010 03:54 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

I think this may be it, I uploaded it for ya.

Code:

http://www.mediafire.com/file/4yhzqzjzjvm/tools.zip

headala

06-30-2010 09:12 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Is anyone else getting an "Error - -249 outside of string table limits" with this? I get it with both the CLI and the GUI.

headala

07-05-2010 03:19 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

If there is anyone who could please help me with the above error I would appreciate it.

I've now tried it on a 32-bit system as well, with the same results. Perhaps it's an issue with my ebook?

nerfherder

07-21-2010 03:41 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by headala (Post 19925)

Is anyone else getting an "Error - -249 outside of string table limits" with this? I get it with both the CLI and the GUI.

I had a similar error. I've had success before, so I think it only happens with certain books. This version of convert2xml.py worked for me:

Code:

http://pastebin.com/RDERReyk

The version number is the same as the one that failed for me, but this one is clearly an improvement.

headala

07-25-2010 10:16 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by nerfherder (Post 20383)

I had a similar error. I've had success before, so I think it only happens with certain books. This version of convert2xml.py worked for me:

Code:

http://pastebin.com/RDERReyk

The version number is the same as the one that failed for me, but this one is clearly an improvement.

Thank you SO very much for your reply...I was beginning to lose hope that anyone would help!

However, I am still getting the same error. Could you PM me or post the whole topaz tools files you are using? Thanks!

epstewart

08-15-2010 12:56 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Help! I accidentally obtained a Topaz e-book and, via Amazon Kindle for Mac, downloaded it. I also downloaded it to Kindle for the iPhone. I would like to convert it to an EPUB for iBooks on the iPhone. As I read through the posts in this thread, I find myself getting very confused. Exactly how do I do what I want to do?

I do know how to use the skindle app to unlock the downloaded file (using Parallels emulator software which lets me run Windows on my Mac). The result can be either a compressed .tpz file or an uncompressed one — but neither of those works with calibre! Does unlocking the original file with skindle get me any closer to the result I'm after?

Some of the Python scripts mentioned here seem to assume one has a physical Kindle, for which one needs to specify the PID (whatever that is). How do I go about doing things if all I have are a Mac and an iPhone, with no PID?

I get the feeling I have to convert the downloaded .tpz file to the SVG format. I have no idea what SVG means, though, or why I can't convert the .tpz file directly to an EPUB. I'd like to find out the answers to both those questions.

I also think I understand that the SVG file then has to be converted to HTML, which can at last be input to calibre and converted to EPUB format. Am I right about that? Again, why not just convert the .tpz file directly to (if not EPUB) HTML?

Any help anyone can give me will be much appreciated ...

Stream Recorder

08-15-2010 10:40 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by epstewart (Post 20879)

There is no need to use Parallels Desktop for Mac OS X:
http://stream-recorder.com/forum/sho...8&postcount=24

Anonymouslemming

09-10-2010 05:13 PM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by Stream Recorder (Post 20892)

There is no need to use Parallels Desktop for Mac OS X:
http://stream-recorder.com/forum/sho...8&postcount=24

Hi - that post you link to states that you need to know your PID. I can't find that from Kindle4Mac or Kindle4PC. Where would I get that without owning a kindle ?

Also, when I run TopazExtract_Kindle4PC.pyw, I get the following error:

ImportError: cannot import name windll

Stream Recorder

09-11-2010 02:03 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by Anonymouslemming (Post 21362)

Hi - that post you link to states that you need to know your PID. I can't find that from Kindle4Mac or Kindle4PC. Where would I get that without owning a kindle ?

skindle - remove DRM from KindleForPC ebooks (mobi and topaz)
Removing DRM protection from Kindle for PC ebooks using unswindle
DeDRM AppleScript for Mac OS X 10.5, 10.6

ch mn

09-11-2010 10:03 AM

Script bugs?

I finally found the tools to try to convert a topaz book to something more portable (maybe?). The book downloaded to my PC as .azw file along with a .mbp file, but it looked like a topaz book and Skindle identified it as a topaz book. I renamed it Book.tpz for simplicity and ran the following script:

python cmbtc_dump.py -d -o Book Book.tpz

I got the following result:
File "cmbtc_dump.py", line 774
except Exception as message:
^
SyntaxError: invalid syntax
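A likely cause, though I can't be sure from the traceback alone: the `except Exception as message:` form was only added in Python 2.6, so an older interpreter such as 2.5 rejects the whole file with a SyntaxError before running anything. A minimal sketch for checking the interpreter version (the helper name is hypothetical):

```python
import sys


def supports_except_as(version_info=sys.version_info):
    """Return True if this Python version understands 'except X as name'."""
    # The 'as' form of except arrived in Python 2.6 and is the only form in Python 3.
    return tuple(version_info[:2]) >= (2, 6)
```

Running `python -V` in the same command prompt shows which version is actually being used.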

However this script seemed to work:
python cmbtc_dump_nonk4pc.py -d -o Book -p abCdeFgh Book.tpz

Following the instructions, I then ran:
python gensvg.py Book

Which in the end resulted in the following:
page0055.dat
Traceback (most recent call last):
File "gensvg.py", line 405, in <module>
sys.exit(main(''))
File "gensvg.py", line 329, in main
flat_xml = convert2xml.main(pargv)
File "C:\EZSkindle\Topaz\lib\convert2xml.py", line 789, in main
xmlpage = pp.process()
File "C:\EZSkindle\Topaz\lib\convert2xml.py", line 703, in process
tag = self.procToken(self.dict.lookup(v))
File "C:\EZSkindle\Topaz\lib\convert2xml.py", line 439, in procToken
subtagres.append(self.procToken(self.dict.lookup(val)))
File "C:\EZSkindle\Topaz\lib\convert2xml.py", line 439, in procToken
subtagres.append(self.procToken(self.dict.lookup(val)))
File "C:\EZSkindle\Topaz\lib\convert2xml.py", line 140, in lookup
print "Error - %d outside of string table limits" % val
TypeError: int argument required

Of course everything after that failed.
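About the last line of that traceback: the script fails while trying to print its own error message, because `%d` only accepts numbers and `val` was apparently something else at that point. What `val` actually held is unknown, so this is only a sketch of the formatting problem, written in Python 3 syntax, with hypothetical function names:

```python
def percent_d_accepts(val):
    """Return True if '%d' can format val, False if it raises TypeError."""
    try:
        "%d" % (val,)
    except TypeError:
        return False
    return True


def format_lookup_error(val):
    """Build the same message without assuming val is an integer."""
    # %r accepts any object; the one-element tuple stops a tuple val from being unpacked.
    return "Error - %r outside of string table limits" % (val,)
```

With an integer both approaches give the same text; with anything else the `%r` version still produces a message, so the real "outside of string table limits" error would be visible instead of the TypeError.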

Although I am an engineer, I am not a programmer. I don't know if I got a buggy version of the tools, if there is something weird about this book, or if I did something wrong. I have not tried another book, as this is the only one I have, unless someone can point me to a free download that is definitely a Topaz book that these tools have successfully been used on.

Stream Recorder

10-05-2010 08:35 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Quote:

Originally Posted by Chris9 (Post 21963)

I used TopazExtract_Kindle4PC.pyw on a PRC file successfully ... but when I use topazfiles2html.pyw I just get an error message: "HTML conversion failed"

If I use topazfiles2xml.pyw it will work, but is there then a way to convert the xml files into an epub????

Use Calibre for converting XML to ePub or other formats.

Chris9

10-05-2010 09:00 AM

Re: How to convert Topaz ebooks to HTML (Remove DRM from TPZ and AZW1 books for Kindl

Never mind, I was able to make it work. You have to run the files in this order (for a topaz prc) ...

1. TopazExtract_Kindle4PC
2. TopazFiles2SVG
3. TopazFiles2html

The resulting book.html file was easily exported into Calibre with a metadata TOC intact. On a quick inspection, there may be only minor cleanups needed which I can do in Sigil.

Edit: It wasn't as clean as I thought it would be. There are typical errors you get from OCR books, as this uses an OCR process to convert the books. Topaz is a crappy format, and even on K4PC the book's text doesn't look that great. One problem is that Amazon doesn't specify on its kindle books whether they are topaz format or not. So it's risky buying from them unless you ONLY want to read the book on a kindle device. In this case, for me, I could not find this title in an ebook version anyplace else. So it was worth it for me. But if I have a choice, I will always opt for a non-Amazon ebook first.

All times are GMT -6.