Old 11-04-2012, 12:01 AM
mcgiorda

Re: How can I batch download Crunchyroll XML subtitles?


Hello!

I have a script that downloads all subtitles (all languages) for an episode.
You could modify the script to download several links.

The script uses two languages: PHP and Python.

I'm assuming that you have an Apache server installed with the PHP extension enabled. If you don't know how to install one, search for a guide online.
TIP: Install XAMPP (good for beginners)

I'm also assuming that you have a Python interpreter installed. If you don't, get it here: http://www.python.org/ftp/python/2.7.3/python-2.7.3.msi

1. Create a PHP page called index.php

2. Code for index.php:
Code:
<!DOCTYPE html>
<html>
    <head>
        <title>XML PARSER Crunchyroll</title>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    </head>
    <body>
        <h1>SUBTITLES GRABBER FOR CRUNCHYROLL</h1>
        <div>
            <form name="getASS" method="GET" action="parser.php">
                <input type="text" name="url" size="80" placeholder="Place your url here"><br>
                <input type="text" name="proxy" size="80" placeholder="Place your proxy (IF NECESSARY) here. ex: 96.47.230.49:8080">
                 Only necessary if you are downloading subtitles for a region-restricted anime<br>
                <input type="submit" value="Get Subtitles">
            </form>
            <span style="color:red">
                <p>!Attention!</p>
                <p>Since this script gets ALL available subtitles, it may take a while.</p>
                <p>Be patient: it takes 15 seconds at most.</p>
            </span>
            <span style="color:gray">
                <p>!Atenção!</p>
                <p>Como o script baixa TODAS as legendas disponíveis, ele pode demorar um pouco.</p>
                <p>Não se apresse, vai demorar no máximo 15 segundos.</p>
            </span>
        </div>
    </body>
</html>
3. Create a PHP page called parser.php

4. Code for parser.php: * This is the file you need to adjust to your needs.
If you get stuck, ask me and I'll try to do it for you.
Code:
<?php
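header("Content-Type: text/html; charset=UTF-8");
// file_get_html() comes from the PHP Simple HTML DOM Parser; simple_html_dom.php must sit next to this file.
include('simple_html_dom.php');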

if(!isset($_GET["url"])){
	die("No URL given! / Sem URL!");
}
if(!empty($_GET["proxy"])){
	$aContext = array(
	    'http' => array(
	        'proxy' => 'tcp://'.$_GET["proxy"],
	        'request_fulluri' => true,
	    ),
	);
	$cxContext = stream_context_create($aContext);

	$html = file_get_html($_GET["url"],false,$cxContext);
}
else
	$html = file_get_html($_GET["url"]);

$languages = $html->find('div[id=showmedia_about_info_details] div span a');

// split() was removed in PHP 7; explode() does the same job for literal delimiters.
$aux_title = explode("/",$_GET["url"]);
$title = $aux_title[3];
$aux_title_ = explode("-",$aux_title[4]);
$title .= "-".$aux_title_[0]."-".$aux_title_[1];

echo "<h1>Legendas para ".$aux_title[3]." - episódio ".$aux_title_[1]."</h1>";
echo "<h1>Subtitles for ".$aux_title[3]." - episode ".$aux_title_[1]."</h1>";
$i=0;

while(@$languages[$i]){

	if($languages[$i]->plaintext != "English (US)" && $languages[$i]->plaintext != "Español" && $languages[$i]->plaintext != "Français (France)" && $languages[$i]->plaintext != "Português (Brasil)"){
		$i++;
		continue;
	}

	$l_title = $title."-".str_replace(" ", "-", $languages[$i]->plaintext);

?>
<a href="legendas/<?=utf8_encode($l_title).".ass"?>"><?=$l_title?></a><br>
<?php

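	// Skip subtitles already downloaded (assumes decode.py saves them in legendas/, as the link above implies).
	if(file_exists("legendas/".$l_title.".ass")){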
		$i++;
		continue;
	}

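	// explode() replaces split(), which was removed in PHP 7.
	$id = explode("=",$languages[$i]->href);
	$id = $id[1];
	// Escape every argument: the URL comes straight from user input.
	$command = __DIR__."\\crunchy_xml_decoder\\decode.py ".escapeshellarg($_GET["url"])." ".escapeshellarg($languages[$i]->plaintext)." ".escapeshellarg($l_title)." ".escapeshellarg($id);
	$temp = exec($command);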
	$i++;

}
if($i == 0){
	echo "Try using a proxy!<br>Tente utilizar um proxy!<br>";
}
?>
<br>
<a href="index.php"><button>Back / Voltar</button></a>
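
If the naming logic in parser.php is hard to follow, here is a rough Python 3 sketch of the same string handling: split the URL on "/" and "-" to get the series and episode number, then append the language name with spaces turned into dashes. The function names and the sample URL are made up for illustration; they are not part of the ZIP package.

```python
# Illustrative sketch only: mirrors the string handling in parser.php.
# build_title, subtitle_filename and the sample URL are hypothetical.

WANTED_LANGUAGES = {
    "English (US)",
    "Español",
    "Français (France)",
    "Português (Brasil)",
}


def build_title(url):
    """Return '<series>-<word>-<number>' from an episode URL."""
    parts = url.split("/")      # ['http:', '', host, series, episode-slug]
    series = parts[3]
    slug = parts[4].split("-")  # e.g. ['episode', '1', 'some', 'name', '123456']
    return "-".join([series, slug[0], slug[1]])


def subtitle_filename(url, language):
    """Return the .ass file name parser.php links to, or None if the language is skipped."""
    if language not in WANTED_LANGUAGES:
        return None
    return build_title(url) + "-" + language.replace(" ", "-") + ".ass"


if __name__ == "__main__":
    sample = "http://www.crunchyroll.com/some-series/episode-1-some-name-123456"
    print(subtitle_filename(sample, "English (US)"))
```

To grab more languages, the set at the top is the Python counterpart of the long if condition in parser.php.
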
5. Download the ZIP package here (contains all python scripts): http://svgen.com/crunchy_xml_decoder.zip

6. Extract it into the same folder as index.php

Example:

|--/index.php
|--/parser.php
|--/crunchy_xml_decoder/--decode.py
|--/crunchy_xml_decoder/--setup.py
.
.
.

7. Create a folder called legendas (Portuguese for "subtitles") next to index.php

Example:

|--/index.php
|--/parser.php
|--/legendas/
|--/crunchy_xml_decoder/--decode.py
|--/crunchy_xml_decoder/--setup.py
.
.
.
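
If something doesn't work, a quick way to rule out a misplaced file is to check the layout above with a few lines of Python 3. This is only a sketch: missing_items is a helper name I made up, and it only checks the files named in this post.

```python
# Illustrative sketch: check that the folder layout from steps 6 and 7 exists.
# missing_items is a hypothetical helper, not part of the ZIP package.
from pathlib import Path

EXPECTED = [
    "index.php",
    "parser.php",
    "legendas",
    "crunchy_xml_decoder/decode.py",
]


def missing_items(root):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    # "." assumes you run this from the folder that holds index.php.
    missing = missing_items(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Layout looks fine.")
```
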

8. Open your localhost in a browser and enjoy! (e.g. http://localhost:80/)
The port will most likely be 80.
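
Since the original question was about batch downloads, here is one way it could be done without touching the PHP: a small Python 3 script that requests parser.php once per episode URL. It is a sketch, not tested against the real site. It assumes the pages above are served at http://localhost/parser.php; BASE and the function names are placeholders I made up.

```python
# Illustrative sketch: call parser.php once per episode URL.
# BASE is an assumption; adjust it to wherever your parser.php is served.
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://localhost/parser.php"


def parser_request_url(episode_url, proxy=""):
    """Build the GET URL that the form in index.php would submit."""
    params = {"url": episode_url}
    if proxy:
        params["proxy"] = proxy  # same host:port format as the form field
    return BASE + "?" + urlencode(params)


def fetch_all(episode_urls, proxy=""):
    """Request parser.php for each episode; each request triggers the download."""
    for episode_url in episode_urls:
        # Generous timeout because parser.php fetches every language in one go.
        with urlopen(parser_request_url(episode_url, proxy), timeout=60) as response:
            print(episode_url, "->", response.status)
```

Call fetch_all with a list of episode page URLs (and a proxy string if you need one). The subtitles should end up wherever parser.php normally puts them.
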

If you need help, just ask and I'll try to explain it better!