Topic: Fun little feature (Read 81 times)
Terraji
Admin Team CSR Connoisseur
Karma: +35/-15
Offline
Gender:
Posts: 789
Fun little feature
« on: July 27, 2006, 10:19:38 PM »
This post is a test.
I will post my explanation in a minute or two.
Terraji
Re:Fun little feature
« Reply #1 on: July 27, 2006, 10:22:39 PM »
Yay, it worked!
Over the last two hours or so I set about figuring out a little RSS, and for fun I whipped up a small script that may be useful to some of you.
My script scrapes the HTML of the csreloaded front page for the "Recent Posts" section and makes a cute little RSS feed out of it. The HTML is grabbed every minute (is that too often?) and a new .xml file is uploaded for your RSS reader to grab.
Here is the link: http://www.ece.ualberta.ca/~jarret/csrrss.xml
To use it in Firefox, go to Bookmarks -> Manage Bookmarks, then File -> New Live Bookmark, and paste the link into the Feed Location field. This will eliminate the need to keep hitting the front page to check for new posts.
I am pretty sure it violates the rules for using university bandwidth, but I am sure I can get away with it for at least a while before somebody notices and/or cares.
This raises the question: does somebody have a machine (like icabbit, if it is still around) where this could be run more permanently? Alternatively, Porter could just write a little PHP on the site to do the same task much more elegantly.
Here are the details.
In my crontab:
*/1 * * * * jarret /home/jarret/csrrss/genrss.sh
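On the "is every minute too often?" question, one possible way to soften the polling is a conditional GET, so an unchanged page costs only a short 304 reply. The sketch below is an assumption layered on top of the post, not part of the original script: the helper names conditional_request and fetch_if_changed are made up for illustration, and a dynamically generated front page may well ignore the If-Modified-Since header and send the full page every time.

```ruby
require 'net/http'
require 'time'
require 'uri'

# Build a GET request that asks the server to reply "304 Not Modified"
# when the page has not changed since last_fetch (a Time).
# Hypothetical helper, not part of genrss.sh.
def conditional_request(url, last_fetch)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request['If-Modified-Since'] = last_fetch.httpdate
  request
end

# Send the request. Returns the page body, or nil when the server
# reports that nothing changed since last_fetch.
def fetch_if_changed(url, last_fetch)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port) do |http|
    http.request(conditional_request(url, last_fetch))
  end
  response.is_a?(Net::HTTPNotModified) ? nil : response.body
end
```

A cron job could remember the time of its last successful fetch in a small file and skip rebuilding the feed whenever fetch_if_changed returns nil.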
genrss.sh:
#!/bin/sh
export PWD="/home/jarret/csrrss/"
wget http://www.csreloaded.com -O /tmp/main.html
cat /tmp/main.html | perl -ne "print if(\$. >`cat -b /tmp/main.html |grep "\-\- Recent Posts \-\-" | awk '{print $1}'` && \$. < `cat -b /tmp/main.html |grep "\-\- HLstats Top 5 \-\-" | awk '{print $1}'`)" | sed "s@<br />@\n@g" > /tmp/html.tmp
cat /tmp/html.tmp |grep "href" |tr '"' ' ' |awk '{print $3}' > /tmp/links.tmp
cat /tmp/html.tmp |grep "href" > /tmp/titles.tmp
cat /tmp/html.tmp |grep "By" |awk '{print $3}' > /tmp/names.tmp
cat /tmp/html.tmp |grep "On\|Today" > /tmp/pdate.tmp
cat /tmp/html.tmp |grep "Time" |tr '>' '\n'|tr '<' '\n' |grep "AM\|PM" > /tmp/ptime.tmp
ruby /home/jarret/csrrss/buildrss.rb > /tmp/csrrss.xml
rm -rf /tmp/html.tmp /tmp/links.tmp /tmp/titles.tmp /tmp/names.tmp /tmp/pdate.tmp /tmp/ptime.tmp /tmp/main.html
scp /tmp/csrrss.xml lab.jarret.ca:web-docs/
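The perl/cat/grep step in genrss.sh keeps the lines between the "-- Recent Posts --" and "-- HLstats Top 5 --" markers. The same slicing could probably be done in the Ruby script itself. The following is only a rough sketch: section_between is a made-up helper name, and it matches the markers as plain substrings. Searching the lines directly also avoids any mismatch between cat -b, which numbers only non-blank lines, and perl's $., which counts every line.

```ruby
# Return the lines strictly between the first line containing start_marker
# and the first later line containing end_marker.
# Returns an empty array when either marker is missing.
def section_between(html, start_marker, end_marker)
  lines = html.lines
  first = lines.index { |line| line.include?(start_marker) }
  return [] if first.nil?

  rest = lines.drop(first + 1)
  last = rest.index { |line| line.include?(end_marker) }
  last.nil? ? [] : rest.take(last)
end
```

Joining the result and splitting it on "<br />" would then stand in for the sed step.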
And the Ruby script (I should really learn some Perl someday), buildrss.rb:
def extract_title(input)
  the_start = 82
  the_end = input.index('<',82)
  input.slice(the_start, the_end - the_start).chomp
end

def clean_date(input)
  the_start = 2
  the_end = input.index(',',2)
  input.slice(the_start, the_end - the_start).chomp
end

linkf = File.new('/tmp/links.tmp')
namesf = File.new('/tmp/names.tmp')
pdatef = File.new('/tmp/pdate.tmp')
ptimef = File.new('/tmp/ptime.tmp')
titlesf = File.new('/tmp/titles.tmp')

linklines = Array.new
namelines = Array.new
ptimelines = Array.new
pdatelines = Array.new
titlelines = Array.new

titles = Array.new
links = Array.new

begin
  while (line = linkf.readline) do
    linklines << line
  end
rescue EOFError
  linkf.close
end
begin
  while (line = namesf.readline) do
    namelines << line
  end
rescue EOFError
  namesf.close
end
begin
  while (line = ptimef.readline) do
    ptimelines << line
  end
rescue EOFError
  ptimef.close
end
begin
  while (line = pdatef.readline) do
    pdatelines << line
  end
rescue EOFError
  pdatef.close
end
begin
  while (line = titlesf.readline) do
    titlelines << line
  end
rescue EOFError
  titlesf.close
end

0.upto(4) do |i|
  titles << clean_date(pdatelines[i].chomp) + ", " + ptimelines[i].chomp + " - " + namelines[i].chomp + " - " + extract_title(titlelines[i])
  links << "http://www.csreloaded.com" + linklines[i].chomp
end

puts "<rss version=\"0.91\">"
puts "<channel>"
puts "<title>CSReloaded.com Recent Posts</title>"
puts "<link>http://www.csreloaded.com</link>"
puts "<description>recent posts in the csr board</description>"
puts "<language>en-us</language>"
0.upto(4) do |i|
  puts "<item>"
  puts "<title>#{titles[i].to_s}</title>"
  puts "<link>#{links[i].to_s}</link>"
  puts "<description>nil</description>"
  puts "</item>"
end
puts "</channel>"
puts "</rss>"
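One thing buildrss.rb does not do is escape the scraped text, so a thread title containing "&" or "<" would likely produce a feed that readers refuse to parse. A possible fix is sketched below. xml_escape and rss_item are illustrative names rather than anything from the original script, and the sketch leaves out the description element for brevity.

```ruby
# Characters that must be escaped inside XML text, with their entities.
XML_ESCAPES = {
  '&' => '&amp;',
  '<' => '&lt;',
  '>' => '&gt;',
  '"' => '&quot;',
  "'" => '&apos;'
}.freeze

# Replace XML-special characters in text with their entities.
def xml_escape(text)
  text.to_s.gsub(/[&<>"']/, XML_ESCAPES)
end

# Build one <item> element from an unescaped title and link.
def rss_item(title, link)
  [
    '<item>',
    "<title>#{xml_escape(title)}</title>",
    "<link>#{xml_escape(link)}</link>",
    '</item>'
  ].join("\n")
end
```

In the script, the item loop could then print rss_item(titles[i], links[i]) instead of interpolating the raw strings.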
Comments and suggestions are welcome.
Porter
[Wumpa]
Board Admin
Karma: +176/--88
Offline
Gender:
Posts: 3910
Re:Fun little feature
« Reply #2 on: July 29, 2006, 07:06:57 AM »
Yes, I can work on that. It actually was originally part of the site when Ryo built it, but it's never been maintained or advertised.
(If you look at the HTML source for the homepage, you can see under the <!-- Donations and XML --> section that there is a commented-out link to "xml.php". Currently it gives a 404, but I know it's still around here somewhere.)
EDIT: Found it!
I'll probably put up feeds for the front-page news and the recent posts, unless anyone has better suggestions.
It probably won't be this weekend, but I'll try to get to it one night this coming week.
Neat script, Terraji!
[Wumpa] Porter --Silent, professional, lethal... sometimes.
Terraji
Re:Fun little feature
« Reply #3 on: July 29, 2006, 12:31:24 PM »
How about merging them into one? You could just make the posts from the news section appear in the recent posts and use that feed.
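A merged feed along these lines could be as simple as concatenating the two lists and sorting by time. This is only a sketch under assumptions: the Entry struct, its fields, and merge_entries are invented here for illustration and say nothing about how the site actually stores news or posts.

```ruby
# Hypothetical record for anything that can appear in the feed.
Entry = Struct.new(:title, :link, :posted_at)

# Combine news items and forum posts into one list, newest first,
# keeping at most limit entries.
def merge_entries(news, posts, limit = 5)
  (news + posts).sort_by(&:posted_at).reverse.first(limit)
end
```

Readers would then see a single stream in which news items simply sort in among the forum posts by date.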
Porter
Re:Fun little feature
« Reply #4 on: July 29, 2006, 06:27:48 PM »
Yeah, I could do that.
What I really need is advice on the TYPE of feed to supply. I haven't really kept up with RSS technology or protocols since 2002. Somebody want to link me to a protocol spec? What do they have these days? Atom? RSS v x.x? RDF?
What the hell are these, and which one do you want me to re-implement?
Porter
Re:Fun little feature
« Reply #5 on: July 29, 2006, 06:36:43 PM »
Never mind, I'm a quick study.
http://blogs.law.harvard.edu/tech/rss
I found a PHP class that should do nicely to produce output in any version at runtime. I'll keep you all posted.
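For anyone comparing versions: one practical difference is that an RSS 2.0 item can carry a pubDate (in RFC 822 date format) and a guid, where the 0.91 feed earlier in the thread only has a title, link, and description. Here is a small Ruby sketch of such an item, in the same language as the script above. The names esc and rss2_item are made up, and treating forum timestamps as UTC is an assumption, since the thread does not say which timezone the board uses.

```ruby
# Entities for the characters that are special in XML text.
ENTITY_MAP = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }.freeze

# Escape XML-special characters in text.
def esc(text)
  text.to_s.gsub(/[&<>]/, ENTITY_MAP)
end

# Build an RSS 2.0 <item> with a pubDate and a guid.
# posted_at is a Time; it is rendered in UTC (an assumption).
def rss2_item(title, link, posted_at)
  pub_date = posted_at.getutc.strftime('%a, %d %b %Y %H:%M:%S +0000')
  [
    '<item>',
    "<title>#{esc(title)}</title>",
    "<link>#{esc(link)}</link>",
    "<guid>#{esc(link)}</guid>",
    "<pubDate>#{pub_date}</pubDate>",
    '</item>'
  ].join("\n")
end
```

With a stable guid per post, readers can tell new entries from ones they have already shown, which matters for a feed that is regenerated often.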
Terraji
Re:Fun little feature
« Reply #6 on: July 30, 2006, 02:59:33 PM »
Here's another idea:
put the server status as the first (or last) entry in the feed.
Porter
Re:Fun little feature
« Reply #7 on: July 31, 2006, 07:13:45 AM »
Quote from Terraji:
"here's another idea:
put the server status as the first (or last) entry in the feed."
Ooooo, I like that!
I built out most of the framework for the script that will parse the parameters and supply a feed based on them, and I'm working on the DB queries and the RSS-specific stuff now. Once I have something working I'll move on to adding that.
Awesome idea.
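The status-entry idea amounts to adding one synthetic item to the list before the feed is written out. A minimal Ruby sketch follows, again in the language of the script earlier in the thread. with_status is a hypothetical helper, and the item values are whatever the feed builder already uses, shown here as plain strings.

```ruby
# Return a new list with status_item added at the chosen end.
# position is :first or :last; the input list is left unchanged.
def with_status(items, status_item, position = :first)
  case position
  when :first then [status_item] + items
  when :last  then items + [status_item]
  else raise ArgumentError, 'position must be :first or :last'
  end
end
```

Keeping the position as a parameter leaves the "first (or last)" choice open until people have tried both in their readers.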
CSReloaded Forums | Powered by YaBB SE
© 2001-2003, YaBB SE Dev Team. All Rights Reserved. |