14 August 2011 by Stuart Cam
There is a well-known technique within black-hat SEO called content spinning. Content spinning involves writing an article in a special syntax, where alternatives are wrapped in braces and separated by pipes (for example {Hi|Hello}), so that a text processor can generate random variations. These variations can then be used in multiple places without fear of being labelled as duplicate content.
I was looking for a C# algorithm that I could repurpose into a NAnt script, but a few internet searches turned up nothing. I found a Python version that seemed good enough, so I ported it to C#.
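    using System;

    public static class Spinner
    {
        // Shared generator. System.Random is not thread-safe, so guard it
        // if Spin is ever called from several threads at once.
        private static readonly Random Randomizer = new Random();

        public static string Spin(string content)
        {
            const char OPEN_BRACE = '{';
            const char CLOSE_BRACE = '}';
            const char DELIMITER = '|';

            var start = content.IndexOf(OPEN_BRACE);
            var end = content.IndexOf(CLOSE_BRACE);

            // No opening brace: nothing left to spin at this level.
            if (start == -1)
            {
                return content;
            }

            // An opening brace with no closing brace at all is an error.
            // This check must come before the "end < start" test below,
            // otherwise end == -1 would satisfy it and never throw.
            if (end == -1)
            {
                throw new ArgumentException("Unbalanced brace.");
            }

            // A closing brace before the first opening brace belongs to
            // the caller's group, so hand the text back untouched.
            if (end < start)
            {
                return content;
            }

            // Spin everything after the opening brace first, so that any
            // nested groups are resolved before this group is split.
            var rest = Spin(content.Substring(start + 1));
            end = rest.IndexOf(CLOSE_BRACE);
            if (end == -1)
            {
                throw new ArgumentException("Unbalanced brace.");
            }

            // Pick one alternative at random, then spin the remaining text.
            var splits = rest.Substring(0, end).Split(DELIMITER);
            var item = splits[Randomizer.Next(0, splits.Length)];
            return content.Substring(0, start) + item + Spin(rest.Substring(end + 1));
        }
    }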
Usage:
Spinner.Spin("{Hi|Hello|Good morning}, my name is {Matt|Bob} and I {certainly |might |should |}have {something {important|special} to {say|eat|share}|a favorite {toy|book|poem|song}}.");
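For a build script it can be handy to get the same variation on every run. The following is only a rough sketch of one way to do that, not the class above: it takes the Random as a parameter so the caller can seed it, and it swaps the recursion for a regular expression that resolves the innermost groups first. The names RegexSpinner and InnermostGroup are made up for this example. Unlike the recursive version, it leaves unbalanced braces in place rather than throwing.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexSpinner
{
    // Matches an innermost group: a brace pair with no braces inside it.
    private static readonly Regex InnermostGroup = new Regex(@"\{([^{}]*)\}");

    // The caller supplies the Random, so a fixed seed can be used.
    public static string Spin(string content, Random random)
    {
        var result = content;

        // Each pass replaces every innermost group with one of its
        // alternatives, which exposes the next level of nesting.
        while (InnermostGroup.IsMatch(result))
        {
            result = InnermostGroup.Replace(result, match =>
            {
                var options = match.Groups[1].Value.Split('|');
                return options[random.Next(options.Length)];
            });
        }

        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        // Fixed seed (an arbitrary choice) so repeated runs on the same
        // runtime print the same three variations.
        var random = new Random(42);
        const string template =
            "{Hi|Hello}, my name is {Matt|Bob} and I have " +
            "{something {important|special}|a favorite {toy|book}}.";

        for (var i = 0; i < 3; i++)
        {
            Console.WriteLine(RegexSpinner.Spin(template, random));
        }
    }
}
```

Passing the generator in also sidesteps the shared static Random, at the cost of the caller having to manage it.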