WebParser: Using StringIndex2


One of the less obvious features of the WebParser measure is the StringIndex2= option, which simplifies skins that retrieve repetitive data such as an RSS or weather feed.

From the Manual:

StringIndex2
The second string index is used when using a RegExp in a measure that uses data from another WebParser measure (i.e. the URL points to a parent measure). In this case the StringIndex defines the index of the result of the parent measure's RegExp and the StringIndex2 defines the index of this measure's RegExp (i.e. it defines the string that the measure returns).

First, here is the complete code for a small skin that demonstrates this feature. Go ahead and create it, then run it to see it in action:

[Rainmeter]
Update=1000
DynamicWindowSize=1

[Variables]
Item=.*<item>(.*)</item>
Sub="<![CDATA[":"","]]>":""

[MeasureSite]
Measure=WebParser
Url=http://feeds.gawker.com/lifehacker/full
RegExp="(?siU)<title>(.*)</title>.*<link>(.*)</link>#Item##Item##Item#"

[MeasureMainTitle]
Measure=WebParser
Url=[MeasureSite]
StringIndex=1
Substitute=#Sub#

[MeasureMainLink]
Measure=WebParser
Url=[MeasureSite]
StringIndex=2

[MeasureItem1Title]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<title>(.*)</title>"
StringIndex=3
StringIndex2=1
Substitute=#Sub#

[MeasureItem1Link]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<link>(.*)</link>"
StringIndex=3
StringIndex2=1

[MeasureItem2Title]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<title>(.*)</title>"
StringIndex=4
StringIndex2=1
Substitute=#Sub#

[MeasureItem2Link]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<link>(.*)</link>"
StringIndex=4
StringIndex2=1

[MeasureItem3Title]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<title>(.*)</title>"
StringIndex=5
StringIndex2=1
Substitute=#Sub#

[MeasureItem3Link]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<link>(.*)</link>"
StringIndex=5
StringIndex2=1

[MeterMainTitle]
Meter=String
MeasureName=MeasureMainTitle
X=0
Y=0
FontSize=13
FontColor=255,255,255,255
SolidColor=0,0,0,1
StringStyle=Bold
Antialias=1
LeftMouseUpAction=["[MeasureMainLink]"]

[MeterItem1Title]
Meter=String
MeasureName=MeasureItem1Title
X=0
Y=5R
W=300
H=35
ClipString=1
FontSize=11
FontColor=255,255,255,255
SolidColor=0,0,0,1
Antialias=1
LeftMouseUpAction=["[MeasureItem1Link]"]

[MeterItem2Title]
Meter=String
MeasureName=MeasureItem2Title
X=0
Y=5R
W=300
H=35
ClipString=1
FontSize=11
FontColor=255,255,255,255
SolidColor=0,0,0,1
Antialias=1
LeftMouseUpAction=["[MeasureItem2Link]"]

[MeterItem3Title]
Meter=String
MeasureName=MeasureItem3Title
X=0
Y=5R
W=300
H=35
ClipString=1
FontSize=11
FontColor=255,255,255,255
SolidColor=0,0,0,1
Antialias=1
LeftMouseUpAction=["[MeasureItem3Link]"]

So let's tear this apart and see what it is doing:

First, we set a variable holding a regular expression fragment that skips ahead to the next <item> in the feed and captures everything between <item> and </item>.

[Variables]
Item=.*<item>(.*)</item>

The second variable in the skin, Sub, is a Substitute list that replaces the "<![CDATA[" and "]]>" wrappers with nothing, so they do not show up in the displayed titles.

Then we use WebParser to go get the information from the site.

[MeasureSite]
Measure=WebParser
Url=http://feeds.gawker.com/lifehacker/full
RegExp="(?siU)<title>(.*)</title>.*<link>(.*)</link>#Item##Item##Item#"

Notice we are doing something a bit different here. Instead of fully parsing the feed into dozens of StringIndexes on the main measure, we capture only the feed's main title and link into StringIndexes 1 and 2, then use #Item# three times to capture the entire contents of the first three <item> blocks into StringIndexes 3, 4 and 5.
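
To make the capture groups easier to count, here is a sketch of the same measure with each #Item# variable expanded by hand. Rainmeter performs this substitution itself, so this form is for illustration only. The comment lines (starting with ;) are additions and not part of the original skin.

```ini
[MeasureSite]
Measure=WebParser
Url=http://feeds.gawker.com/lifehacker/full
; Capture 1: feed <title>    Capture 2: feed <link>
; Captures 3, 4 and 5: the full contents of the first three <item> blocks
RegExp="(?siU)<title>(.*)</title>.*<link>(.*)</link>.*<item>(.*)</item>.*<item>(.*)</item>.*<item>(.*)</item>"
```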

Now let's skip the measures for the main title and link, since they are done the normal way, and move down to this measure:

[MeasureItem1Title]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<title>(.*)</title>"
StringIndex=3
StringIndex2=1
Substitute=#Sub#

Here we use StringIndex2= to get the <title> of the first <item> in the feed. StringIndex=3 selects the entire first <item> captured by the main WebParser measure at the top. Then, unlike a normal "child" measure in WebParser, this measure has its own RegExp=, which parses the text from StringIndex=3, and StringIndex2=1 returns the first capture of that RegExp=, the single bit of information we want from that <item>.

Remember how StringIndex2 works: IF a WebParser measure uses an earlier WebParser measure as its Url= AND the measure has a RegExp= on it, then the RegExp= is applied to the parent data selected by StringIndex, and the capture selected by StringIndex2 is returned.

The "value" of the measure, when used for instance in a MeasureName= option in a meter, is what is returned in StringIndex2.
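
To see the difference from a normal child measure, the two can be placed side by side. This is only a sketch: the name MeasureItem1Raw is hypothetical and is not part of the skin above.

```ini
; Normal child: no RegExp=, so it returns capture 3 of [MeasureSite]
; unchanged, i.e. the whole raw text of the first <item>
[MeasureItem1Raw]
Measure=WebParser
Url=[MeasureSite]
StringIndex=3

; StringIndex2 child: its RegExp= is applied to capture 3 of [MeasureSite],
; and capture 1 of that RegExp= becomes the value of the measure
[MeasureItem1Title]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<title>(.*)</title>"
StringIndex=3
StringIndex2=1
Substitute=#Sub#
```

Binding MeasureItem1Raw to a String meter should show the unparsed item text, which can be handy when working out the RegExp= for the child measures.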

We can repeat this process to get the link of the first item:

[MeasureItem1Link]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<link>(.*)</link>"
StringIndex=3
StringIndex2=1

Note that once again we are getting the entire <item> data from StringIndex=3 of the main measure, then using a RegExp= to get just the link from that information into StringIndex2=1 on this measure.
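
The same pattern should work for any other tag inside the item. As a sketch, assuming the feed's items contain a <pubDate> tag (common in RSS, but not guaranteed for every feed), the publication date of the first item could be retrieved like this. The measure name MeasureItem1Date is hypothetical.

```ini
[MeasureItem1Date]
Measure=WebParser
Url=[MeasureSite]
; Assumes each <item> contains a <pubDate> tag
RegExp="(?siU)<pubDate>(.*)</pubDate>"
; Capture 3 of the parent holds the first <item>
StringIndex=3
; Return capture 1 of this measure's own RegExp=
StringIndex2=1
```

If an item has no <pubDate>, this RegExp= will not match and the measure will not return a date.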

It might be obvious by now, but when we want to carry on with the second <item> from the feed, we use:

[MeasureItem2Title]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<title>(.*)</title>"
StringIndex=4
StringIndex2=1
Substitute=#Sub#

So now we are getting the information for the entire second <item> from the main measure, which is in StringIndex=4 on that measure. Then we parse it with a RegExp= and the result is returned in the StringIndex2=1 on this measure.
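
Adding more items follows the same steps. A sketch for a fourth item, assuming the feed contains at least four <item> blocks (with fewer, the parent RegExp= will fail to match): one more #Item# on the parent adds capture 6, and a new child measure reads it. The name MeasureItem4Title is hypothetical.

```ini
[MeasureSite]
Measure=WebParser
Url=http://feeds.gawker.com/lifehacker/full
; A fourth #Item# adds capture 6 to the parent measure
RegExp="(?siU)<title>(.*)</title>.*<link>(.*)</link>#Item##Item##Item##Item#"

[MeasureItem4Title]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<title>(.*)</title>"
; Capture 6 of the parent holds the fourth <item>
StringIndex=6
; Return capture 1 of this measure's own RegExp=
StringIndex2=1
Substitute=#Sub#
```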

Play with this a bit, and you may find that it saves a lot of extra parsing and spares you from writing one huge RegExp= option that extracts absolutely everything from a site up front. Get the main site in big chunks, then deal with each chunk as if it were a site in and of itself.
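
Since each chunk is parsed like a small site of its own, a child RegExp= can also have more than one capture, with StringIndex2 picking which one to return. The following is a sketch, not part of the skin above. It assumes <title> appears before <link> inside each <item>, and the measure names are hypothetical.

```ini
[MeasureItem1TitleAlt]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<title>(.*)</title>.*<link>(.*)</link>"
StringIndex=3
; Capture 1 of this RegExp=: the item title
StringIndex2=1
Substitute=#Sub#

[MeasureItem1LinkAlt]
Measure=WebParser
Url=[MeasureSite]
RegExp="(?siU)<title>(.*)</title>.*<link>(.*)</link>"
StringIndex=3
; Capture 2 of this RegExp=: the item link
StringIndex2=2
```

The one-tag-per-measure approach used in the skin is more forgiving, since it does not depend on the order of the tags inside the item.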

Hint: This can be a nice way to get around the limit of 99 StringIndex numbers on any single WebParser measure...