Adding a Dynamic Robots.txt file to an ASP.NET MVC site
Robots.txt tells search engine crawlers which parts of your site they may crawl and, more importantly, which parts they may not. If you have a public-facing staging or preliminary site that you don’t want to show up in Google results, you need to make sure that it returns a robots.txt with the
Disallow: /
line to prevent indexing. Maintaining separate robots.txt files by hand for staging and production is error-prone, though. With the approach below, the same code serves a locked-down robots.txt on staging or internal URLs and allows indexing in production.
First, add a route that responds to /robots.txt in either Global.asax.cs or RouteConfig.cs before your Default routes:
routes.MapRoute(
"Robots.txt",
"robots.txt",
new
{
controller = "Robots",
action = "RobotsText"
}
);
You’ll also need to make sure that runAllManagedModulesForAllRequests is true in web.config, because requests for static files such as .txt normally bypass the ASP.NET managed modules, including routing:
<system.webServer>
<modules runAllManagedModulesForAllRequests="true"></modules>
<handlers>
...
</handlers>
</system.webServer>
Then create a new controller called “RobotsController” with a single action “RobotsText”. All requests to /robots.txt will go here:
using System.Text;
using System.Web.Mvc;

public class RobotsController : Controller
{
public FileContentResult RobotsText()
{
var contentBuilder = new StringBuilder();
contentBuilder.AppendLine("User-agent: *");
// change this to however you want to detect a production URL
var isProductionUrl = Request.Url != null && !Request.Url.ToString().ToLowerInvariant().Contains("elasticbeanstalk");
if (isProductionUrl)
{
contentBuilder.AppendLine("Disallow: /elmah.axd");
contentBuilder.AppendLine("Disallow: /admin");
contentBuilder.AppendLine("Disallow: /Admin");
contentBuilder.AppendLine("Sitemap: http://www.mysite.com/sitemap.xml");
}
else
{
contentBuilder.AppendLine("Disallow: /");
}
return File(Encoding.UTF8.GetBytes(contentBuilder.ToString()), "text/plain");
}
}
You can try a number of ways of detecting a production environment, from the naïve URL checking above to environment variables in your application container.
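As a rough sketch of the environment-variable approach, the helper below builds the same robots.txt body as the controller but decides based on a variable rather than the URL. The variable name APP_ENVIRONMENT, the value "Production", and the RobotsTextBuilder class are all assumptions made for illustration; substitute whatever your hosting environment actually sets. Anything other than an explicit "Production" value gets the locked-down file, so a missing variable fails closed.

```csharp
using System;
using System.Text;

public static class RobotsTextBuilder
{
    // Hypothetical variable name; use whatever your hosting environment sets.
    private const string EnvironmentVariableName = "APP_ENVIRONMENT";

    // Returns true only when the variable is explicitly set to "Production".
    public static bool IsProduction()
    {
        var value = Environment.GetEnvironmentVariable(EnvironmentVariableName);
        return string.Equals(value, "Production", StringComparison.OrdinalIgnoreCase);
    }

    // Builds the robots.txt body: selective rules in production, block everything elsewhere.
    public static string Build(bool isProduction)
    {
        var builder = new StringBuilder();
        builder.AppendLine("User-agent: *");
        if (isProduction)
        {
            builder.AppendLine("Disallow: /elmah.axd");
            builder.AppendLine("Disallow: /admin");
            builder.AppendLine("Disallow: /Admin");
            builder.AppendLine("Sitemap: http://www.mysite.com/sitemap.xml");
        }
        else
        {
            builder.AppendLine("Disallow: /");
        }
        return builder.ToString();
    }
}
```

In the RobotsText action, the StringBuilder code could then be replaced by passing RobotsTextBuilder.Build(RobotsTextBuilder.IsProduction()) to Encoding.UTF8.GetBytes, which also makes the rules easy to check without an HTTP request.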