---
Notes to self:
Parsing Xanga "friends" page [ http://www.xanga.com/Private/subs.aspx ].
- The main date tags are found at class="blogheader". That's where I'll have to split things, which will separate entries by day (assuming I check daily for updates, and assuming I get it right).
- a href=\"/home.aspx?user= will find each user's entry. No, that's not true. I should find class="blogbody" first, which marks the table that holds the username (Why didn't they include this in the main entry? Weird people...), followed by a href=\"/home.aspx?user=, from which I will extract the username. Then I have to find the next class="blogbody", which starts the entry itself. (I really ought to be able to count the number of characters between the first and the second class="blogbody" to verify where I am. I assume these ugly tables are script-generated, so they shouldn't be of variable length.)
- Everything between <td width="5%"> </td><td valign="top"> and </td></tr><tr><td width="5%"> </td> should mark out the entry.
- span class="smalltext" to the first </a> marks out the read/post comment link for the entry.
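A rough sketch of how the markers above could be strung together, in Python with plain regular expressions. This is not tested against the live page: the names parse_subs_page, DAY_RE and ENTRY_RE are made up for illustration, and it assumes every entry follows exactly the username-table-then-entry-table layout described in the list.

```python
import re

# Date headers: splitting on these separates the page by day.
DAY_RE = re.compile(r'<div class="blogheader">(.*?)</div>')

# One entry = username table, then entry table, then the comment link.
# The literal markers come from the notes; everything else is assumed.
ENTRY_RE = re.compile(
    r'class="blogbody">.*?<a href="/home\.aspx\?user=([^"]+)">'  # username
    r'.*?class="blogbody"><tr><td width="5%">\s*</td><td valign="top">'
    r'(.*?)'                                                      # entry body
    r'</td></tr><tr><td width="5%">\s*</td>'
    r'.*?<span class="smalltext"><a href="([^"]+)">',             # comment link
    re.S,
)

def parse_subs_page(html):
    """Return a list of (date, username, entry_html, link) tuples."""
    # With one capture group, split() yields:
    # [text_before_first_header, date1, chunk1, date2, chunk2, ...]
    parts = DAY_RE.split(html)
    results = []
    for date, chunk in zip(parts[1::2], parts[2::2]):
        for user, body, link in ENTRY_RE.findall(chunk):
            results.append((date, user, body, link))
    return results
```

Anything before the first blogheader (the "browse subscriptions" bar and the tabs) lands in the first chunk and gets ignored. If Xanga ever changes the table attributes, the entry pattern will silently match nothing, so a real version would want a sanity check on the number of entries found.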
...On a side note, Xanga returns html in one squish. No return/new lines. So the code looks like:
<table border="0" cellspacing="0" cellpadding="4" width="100%"><tr><td valign="top"> </td><td align="right"><span class="smalltext">browse subscriptions: <a href="subs.aspx?nextdate=4%2f28%2f2004+20%3a39%3a30.810&direction=n">Next »</a></span></td></tr></table><table border="1" cellspacing="0" cellpadding="1" width="100%" class="tabs"><tr><td width="3"> </td><td width="130" align="center" id="tabselected"><a href="subs.aspx" class="tabselected">Public Posts</a></td><td width="5"> </td><td width="125" align="center" id="tab"><a href="subsprotected.aspx" class="tab">Protected Posts</a></td><td> </td></tr></table><div class="blogheader">Thursday, April 29, 2004</div><table border="0" cellspacing="0" cellpadding="1" width="100%" class="blogbody"><tr><td width="5%"> </td><td valign="top"><a href="/home.aspx?user=whoa__now"><b>whoa__now</b></a></tr></table><table border="0" cellspacing="0" cellpadding="1" width="100%" class="blogbody"><tr><td width="5%"> </td><td valign="top">someone tell me to ask him out.</td></tr><tr><td width="5%"> </td><td><span class="smalltext"><a href="http://www.xanga.com/item.aspx?user=whoa__now&tab=weblogs&uid=84779473">4:20 PM</a> - <a href="http://www.xanga.com/item.aspx?user=whoa__now&tab=weblogs&uid=84779473">add eprops</a> - <a href="http://www.xanga.com/item.aspx?user=whoa__now&tab=weblogs&uid=84779473">add comments</a> - <a href="/send.aspx?uid=84779473&tab=weblogs">email it</a></span></td></tr></table...
Damn ugly. ::obsessively goes in and adds newlines so it's easier to read::
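Adding the newlines by hand gets old fast, so here is a minimal sketch of doing it in Python instead. The function name add_newlines is made up, and it only breaks where one tag directly follows another, so text inside a tag stays on the same line.

```python
def add_newlines(html):
    """Put a newline between every pair of directly adjacent tags."""
    # "><" only occurs where a tag closes and the next opens immediately.
    return html.replace("><", ">\n<")
```

For example, add_newlines('<a><b>x</b></a>') puts <a>, <b>x</b> and </a> on three separate lines. This is for reading only; the parsing should still run on the original one-line HTML.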
---
It'll be opt-in, in a way. Xanga users will have to access a page similar to this one [ http://darwin.servehttp.com/ibcorner-addafriend.bml ], where they will be able to give their Xanga username and be added as a subscriber to the account. 'Cept the difference is that ibcorner-addafriend.bml exists so that people can see the friends-only posts, and in this case it's to let people's posts get sent over. I might have to add a password system to let people control adding and deleting: a two-layer password system, similar to the college/university page. One password to edit, one personal password to make changes.
Right. Anyways, expected work time: probably a night or two. Parsing should be simple. Posting to LiveJournal, I already do. The biggest difficulty is the testing...
And another strange thing... Does Xanga show all of the recent posts for that day? Because it feels like there should be a lot more entries there...