<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>innerlogics/blog &#187; regex</title>
	<atom:link href="http://www.innerlogics.com/blog/tag/regex/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.innerlogics.com/blog</link>
	<description>niv singer's rants</description>
	<lastBuildDate>Wed, 14 Apr 2010 09:34:45 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Cleaning Flickr Tags</title>
		<link>http://www.innerlogics.com/blog/2008/11/cleaning-flickr-tags/</link>
		<comments>http://www.innerlogics.com/blog/2008/11/cleaning-flickr-tags/#comments</comments>
		<pubDate>Sat, 01 Nov 2008 23:16:17 +0000</pubDate>
		<dc:creator>nivs</dc:creator>
				<category><![CDATA[code]]></category>
		<category><![CDATA[flickr]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[regex]]></category>
		<category><![CDATA[tags]]></category>

		<guid isPermaLink="false">http://www.innerlogics.com/blog/?p=31</guid>
		<description><![CDATA[As part of an application I&#8217;m developing, I needed to store tags from multiple sources, and I chose to use Flickr&#8217;s method of storing raw and clean tags. I needed to figure out how Flickr converts raw tags to clean ones. This article by Terrell Russell helped a lot, but missed a few elements (and I [...]]]></description>
			<content:encoded><![CDATA[<p>As part of an application I&#8217;m developing, I needed to store tags from multiple sources, and I chose to use Flickr&#8217;s method of storing <em>raw</em> and <em>clean</em> tags. I needed to figure out how Flickr converts raw tags to clean ones. <a href="http://weblog.terrellrussell.com/2007/06/clean-and-store-your-raw-tags-like-flickr/">This article</a> by Terrell Russell helped a lot, but missed a few elements (and I needed it in Java).</p>
<p>Russell&#8217;s original regular expression did not strip commas, and I also found that Flickr substitutes certain special characters instead of removing them (I expect to find more as I keep comparing Flickr tags).</p>
<pre class="brush: java;">
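public static String cleanRawTag(String raw, boolean isMachineTag)
{
    if (isMachineTag)
    {
        // raw   = geo:lat=13.751193
        // clean = geo:lat=13751193
        // Lowercase everything up to and including '=' and clean only the value after it.
        // Without an '=', indexOf returns -1 and the whole tag is cleaned as a plain tag.
        int equals = raw.indexOf('=');
        return raw.substring(0, equals + 1).toLowerCase() + cleanRawTag(raw.substring(equals + 1), false);
    }
    else
    {
        // Strip whitespace and punctuation. The hyphen, brackets and backslash are
        // escaped so they are literals inside the character class, not a range.
        String clean = raw.replaceAll(&quot;[\\s\&quot;!@#$%^&amp;*():,\\-_+='/.;`&lt;&gt;\\[\\]?\\\\]&quot;, &quot;&quot;).toLowerCase();
        // Characters that are substituted rather than removed.
        return clean.replace('ß', 's').replace('ς', 'σ');
    }
}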
</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.innerlogics.com/blog/2008/11/cleaning-flickr-tags/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
