Originally Posted By: hybrid8
Nonono... I absolutely loathe bash. Hadn't I made that clear?


Me too. No offense to anyone who contributed to that script, but it's ugly and cryptic.

I wanted a similar script after reading this thread, so I decided to write my own during my lunch break. Here it is.

Why do I like mine better? Well, it's a whole 27 lines shorter ;) Also, it doesn't have any external dependencies like xmlstarlet, awk, or wget. And it's cross-platform: it'll run on Windows, Mac OS X, Linux, Solaris... I also find it much easier to read, which means it's much easier to edit and maintain.

What do you need? You need a Tcl interpreter. If you're on Mac OS X or Linux then you're all set because Tcl should already be installed. If you're on Windows, install Tcl: http://www.activestate.com/activetcl/downloads/
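
If you're not sure whether your interpreter is usable, here's a minimal check you can paste at the tclsh prompt. It only prints the Tcl version and confirms that the http package (the one package the script loads) is available; the variable name httpVersion is just for illustration.

```tcl
# Print the interpreter version and confirm the http package loads.
puts "Tcl version: [info patchlevel]"
set httpVersion [package require http]
puts "http package: $httpVersion"
```

If the second line raises an error instead of printing a version number, the installation is missing the standard library and the script won't run.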

The destination folder for the trailers ($TargetDir) is set to the current working directory by default. So if you put the script in C:\whatever\trailers then you would do this:

> cd C:\whatever\trailers
> tclsh
% source GetTrailers.tcl
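
If you'd rather not cd into the folder first, one option is to take the destination from the command line. This is only a sketch: resolveTargetDir is a hypothetical helper that isn't in the script, and it assumes the result replaces the "set TargetDir [pwd]" line.

```tcl
# Hypothetical helper: pick the download folder from the script arguments,
# falling back to the current directory when none is given.
proc resolveTargetDir {argList} {
    if {[llength $argList] > 0} {
        return [file normalize [lindex $argList 0]]
    }
    return [pwd]
}

# tclsh stores the command-line arguments in the global variable argv.
set TargetDir [resolveTargetDir $::argv]
```

With that in place you could run "tclsh GetTrailers.tcl C:/whatever/trailers" from anywhere. This only works if every path in the script is resolved against $TargetDir rather than the current directory.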

Each movie gets its own folder, named after the movie title. The folder contains the trailer, the large and extra large poster images, and a movieinfo.xml file holding the XML data for that movie (so that you have all the good info in there for some future use).
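
To see what that layout looks like after a run, something like the following would print each movie folder and the files inside it. It's a sketch, not part of the script: listDownloaded is a hypothetical helper, and it assumes every folder directly under the target directory is a movie.

```tcl
# Hypothetical helper: return a flat list of {title fileList} pairs, one
# pair for every folder found directly under targetDir.
proc listDownloaded {targetDir} {
    set result [list]
    foreach dir [lsort -dictionary [glob -nocomplain -directory $targetDir -types d *]] {
        set files [lsort [glob -nocomplain -directory $dir -tails *]]
        lappend result [file tail $dir] $files
    }
    return $result
}

# Print each movie title followed by an indented list of its files.
foreach {title files} [listDownloaded [pwd]] {
    puts $title
    foreach f $files {
        puts "    $f"
    }
}
```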


Code:
#! /usr/bin/tclsh

# Location of the raw XML movie index.
set FeedsURL "http://www.apple.com/trailers/home/xml/current.xml"

# Download to the current directory.
set TargetDir [pwd]

# We'll use this global variable for the raw XML.
set FeedsXML ""

# And this will be for our organized movie data.
array set Movies [list]

# Load this standard Tcl package.
package require http


# Parses all relevant data for the next listed movie in the XML data, starting at the specified character index.
proc parseNextMovie {index} {
	global FeedsXML Movies

	set startIndex 	[string first {<movieinfo id="} $FeedsXML $index]
	if { $startIndex == -1 } {
		# There are no more movies to be parsed.
		return -1
	}
	
	# Look for the closing tag from the start of this movie, and stop if the feed is truncated.
	set endIndex 	[string first {</movieinfo>} $FeedsXML $startIndex]
	if { $endIndex == -1 } {
		return -1
	}
	incr endIndex 11
	set xml 		[string range $FeedsXML $startIndex $endIndex]
	
	# Parse the movie title.
	set index [string first {<title>} $xml]
	incr index 7
	set end [string first {</title>} $xml $index]
	incr end -1
	set title [cleanTitle [string range $xml $index $end]]
	
	# Parse the large movie poster URL.
	set index [string first {<poster><location>} $xml]
	incr index 18
	set end [string first {</location>} $xml $index]
	incr end -1
	set posterLargeURL [string range $xml $index $end]
	
	# Parse the extra large movie poster URL.
	set index [string first {<xlarge>} $xml]
	incr index 8
	set end [string first {</xlarge>} $xml $index]
	incr end -1
	set posterXLargeURL [string range $xml $index $end]
	
	# Parse the trailer URL.
	set index [string first {<preview>} $xml]
	set index [string first {">} $xml $index]
	incr index 2
	set end [string first {</} $xml $index]
	incr end -1
	set trailerURL [string range $xml $index $end]
	
	# Save all this info in our Movies array.
	set Movies($title) [list $xml $posterLargeURL $posterXLargeURL $trailerURL]
	
	# Return the ending character index for this movie within $FeedsXML.
	return $endIndex
}

# Downloads the specified movie trailer and posters.
proc downloadMovie {title} {
	global Movies ProgressBar TargetDir
	
	if { ![info exists Movies($title)] } {
		return
	}
	
	set xml 			[lindex $Movies($title) 0]
	set posterLargeURL 	[lindex $Movies($title) 1]
	set posterXLargeURL	[lindex $Movies($title) 2]
	set trailerURL 		[lindex $Movies($title) 3]
	
	# Download the posters.  
	# Use [catch] just in case the URLs are bad, which they would be if Apple 
	# didn't provide posters for a certain movie for some reason.
	set fileToken [open temp_poster_l w]
	fconfigure $fileToken -translation binary
	catch { 
		set httpToken [http::geturl $posterLargeURL -channel $fileToken]
		http::cleanup $httpToken
	}
	close $fileToken
	
	set fileToken [open temp_poster_xl w]
	fconfigure $fileToken -translation binary
	catch { 
		set httpToken [http::geturl $posterXLargeURL -channel $fileToken]
		http::cleanup $httpToken
	}
	close $fileToken
	
	# Download the trailer.
	set ProgressBar -1
	set fileToken [open temp_trailer w]
	fconfigure $fileToken -translation binary
	catch { 
		set httpToken [http::geturl $trailerURL -channel $fileToken -progress downloadProgress]
		http::cleanup $httpToken
	}
	close $fileToken
	
	# Create a new directory for our freshly downloaded movie.
	set dir $TargetDir/$title
	file mkdir $dir
	
	# Move all of our movie files into this directory.
	# [file extension] already includes the leading dot, so don't add another one.
	file rename temp_poster_l 	$dir/poster_l[file extension $posterLargeURL]
	file rename temp_poster_xl 	$dir/poster_xl[file extension $posterXLargeURL]
	file rename temp_trailer 	$dir/[file tail $trailerURL]
	
	# Save the xml data pertaining to this movie as movieinfo.xml.
	set token [open $dir/movieinfo.xml w]
	puts $token $xml
	close $token

	return
}

# Callback procedure for downloads that keeps us informed of the download progress.
proc downloadProgress {token total current} {
	global ProgressBar
	
	# Initiate ProgressBar if necessary.
	if { $ProgressBar < 0 } {
		set ProgressBar 0
		puts "<------------------>"
		flush stdout
	}
	
	# Calculate the number of progress bars that should be displayed.
	# Skip the update if the server did not report a total size.
	if { $total <= 0 } { return }
	set bytesPerBar [expr { 1.0 * $total / 20 }]
	set bars [expr { int($current / $bytesPerBar) }]
	while { $ProgressBar < $bars } {
		puts -nonewline "|"
		flush stdout
		incr ProgressBar
	}
	return
}

# Replaces undesirable or incompatible characters with friendlier ones.
proc cleanTitle {title} {
	set title [string map {
		&gt;		)
		&lt;		(
		&quot;		'
		&ldquo;		'
		&rdquo;		'
		&bdquo;		'
		&lsquo;		'
		&rsquo;		'
		\"		'
		&sbquo; 	,
		&amp; 		&
		>		)
		<		(
		:		-
		/		-
		\\		-
		?		""
		|		-
		*		+
	} $title]

	return $title
}


# Download the movie index.
set token [http::geturl $FeedsURL]
set FeedsXML [encoding convertfrom utf-8 [http::data $token]]
http::cleanup $token

# Loop through the XML, parsing movie data until there are no more movies to parse.
set index 0
while { $index > -1 } {
	set index [parseNextMovie $index]
}

# Let's see which movies we already have downloaded and remove them from our Movies array.
# We'll see what directories are in our $TargetDir, and assume each is the name of a movie.
foreach file [glob -directory $TargetDir -nocomplain -tails *] {
	if { [file isdirectory [file join $TargetDir $file]] } {
		if { [info exists Movies($file)] } {
			unset Movies($file)
		}
	}
}

# Now our Movies array only contains movies which haven't been downloaded yet.  Let's download them one by one.
set titles [lsort -dictionary -increasing [array names Movies]]
set count 0
foreach title $titles {
	incr count
	puts "\nDownloading $count/[llength $titles] \"$title\""
	downloadMovie $title
}
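
A note on the parsing: parseNextMovie repeats the same pattern for every field (find the opening tag, skip past it, find the closing tag, take the range in between). If you want to pull out other fields, that pattern could be wrapped up roughly like this. It's a sketch only: textBetween is a hypothetical helper that isn't in the script above, and the sample XML is made up for the example.

```tcl
# Hypothetical helper: return the text between the first occurrence of
# openTag (searching from index start) and the next closeTag.
# Returns an empty string if either tag is missing.
proc textBetween {text openTag closeTag {start 0}} {
    set from [string first $openTag $text $start]
    if {$from < 0} { return "" }
    incr from [string length $openTag]
    set to [string first $closeTag $text $from]
    if {$to < 0} { return "" }
    return [string range $text $from [expr {$to - 1}]]
}

# Made-up sample data, for illustration only.
set sample {<movieinfo id="1"><info><title>Example Movie</title></info></movieinfo>}
puts [textBetween $sample <title> </title>]
```

Returning an empty string for a missing tag means the caller can test for it, instead of ending up with a meaningless range when string first returns -1.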

