Monthly Archives: January 2014

Statically Compiled LINQ Queries Are Broken In .NET 4.0

Written by William Roush on January 19, 2014 at 5:31 pm

Diving into how a minor change in error handling in .NET 4.0 has broken using compiled LINQ queries as per the MSDN documentation.

Query was compiled for a different mapping source than the one associated with the specified DataContext.

When working on high-performance LINQ code, this error can cause a massive amount of headaches. This StackOverflow post blames the problem on using multiple LINQ mappings (and the same mapping from different DataContexts counts as "different mappings"). In the example below, we’re going to use the same mapping but different instances, which is extremely common for short-lived DataContexts (and reusing DataContexts comes with a long list of problematic side effects).

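namespace ConsoleApplication1
{
    using System;
    using System.Data.Linq;
    using System.Linq;

    class Program
    {
        // Compiled once and kept in a static field, as in the MSDN examples.
        protected static Func<MyContext, Guid, IQueryable<Post>> Query =
            CompiledQuery.Compile<MyContext, Guid, IQueryable<Post>>(
                (dc, id) =>
                    dc.GetTable<Post>() // table accessor; GetTable<T>() works on any DataContext
                        .Where(p => p.AuthorID == id)
            );

        static void Main(string[] args)
        {
            Guid id = new Guid("340d5914-9d5c-485b-bb8b-9fb97d42be95");
            Guid id2 = new Guid("2453b616-739f-458f-b2e5-54ec7d028785");

            // First context: the query is compiled against this context's mapping source.
            using (var dc = new MyContext("Database.sdf"))
                Console.WriteLine("{0} = {1}", id, Query(dc, id).Count());

            // Second context, new mapping source instance: this call throws on .NET 4.0.
            using (var dc = new MyContext("Database.sdf"))
                Console.WriteLine("{0} = {1}", id2, Query(dc, id2).Count());
        }
    }
}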

This example follows MSDN’s examples, yet I’ve seen people recommending you do this to resolve the changes in .NET 4.0:

protected static Func<MyContext, Guid, IQueryable<Post>> Query
{
    get
    {
        return CompiledQuery.Compile<MyContext, Guid, IQueryable<Post>>(
            (dc, id) => dc.GetTable<Post>().Where(p => p.AuthorID == id));
    }
}

Wait a second! I’m recompiling on every get, right? I’ve seen claims it doesn’t. However, peeking at the IL code doesn’t hint at that; the process is as follows:

  • Check if the query is assignable from ITable, if so let the Lambda function compile it.
  • Create a new CompiledQuery object (just stores the Lambda function as a local variable called “query”).
  • Compile the query using the provider specified by the DataContext (always arg0).

At no point is there a cache check. The only place a cache could be placed is in the provider (SqlProvider doesn’t have one), and that would be a complete maintenance mess if it were done that way.

Using a test application (code is available at, use the db.sql file to generate the database, please use a local installation of MSSQL server to give the best speed possible so that we can evaluate query compilation times), we’re going to force invoking the CompiledQuery.Compile method on every iteration (10,000 by default) by passing in delegates as opposed to passing in the resulting compiled query.

QueryCompiled Average: 0.5639ms
QueryCompiledGet Average: 1.709ms
Individual Queries Average: 2.1312ms
QueryCompiled Different Context (.NET 3.5 only) Average: 0.6051ms
QueryCompiledGet Different Context Average: 1.7518ms
Individual Queries Different Context Average: 2.0723ms

We’re no longer seeing the 1/4 runtime you get with the compiled query. The primary problem lies in this block of code found in CompiledQuery:

if (context.Mapping.MappingSource != this.mappingSource)
	throw Error.QueryWasCompiledForDifferentMappingSource();

This is where the CompiledQuery checks and enforces that you’re using the same mapper. The problem is that System.Data.Linq.Mapping.AttributeMappingSource doesn’t provide an Equals override! So it’s just comparing whether or not they’re the same instance of an object, as opposed to them being equal.

There are a few fixes for this:

  • Use the getter method, and understand that performance benefits will mainly be seen where the result from the property is cached and reused in the same context.
  • Implement your own version of the CompiledQuery class.
  • Reuse DataContexts (typically not recommended! You really shouldn’t…).
  • Stick with .NET 3.5 (ick).
  • Update: RyanF details sharing a MappingSource in the comments below. This is by far the best solution.
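
The shared-MappingSource fix named in the last bullet, as an added example (not code from the original post or from the comment it mentions). The MyContext constructor, the SharedMapping field and the attribute mapping on Post are assumptions, since the page does not show those classes.

```csharp
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Constructed entity: table and column names are assumptions.
[Table(Name = "Posts")]
public class Post
{
    [Column(IsPrimaryKey = true)] public Guid ID;
    [Column] public Guid AuthorID;
}

public class MyContext : DataContext
{
    // One MappingSource for the whole AppDomain. Every MyContext passes this
    // same object to the base constructor, so the reference comparison in
    // CompiledQuery (context.Mapping.MappingSource != this.mappingSource)
    // sees the same instance no matter which context runs the query.
    private static readonly MappingSource SharedMapping = new AttributeMappingSource();

    public MyContext(string connection) : base(connection, SharedMapping) { }

    public Table<Post> Posts { get { return GetTable<Post>(); } }
}

public static class Program
{
    // Compiled once into a static field; no getter, so no recompilation.
    static readonly Func<MyContext, Guid, IQueryable<Post>> Query =
        CompiledQuery.Compile<MyContext, Guid, IQueryable<Post>>(
            (dc, id) => dc.Posts.Where(p => p.AuthorID == id));

    public static void Main()
    {
        Guid id = new Guid("340d5914-9d5c-485b-bb8b-9fb97d42be95");
        Guid id2 = new Guid("2453b616-739f-458f-b2e5-54ec7d028785");

        // Two short-lived contexts reuse the one compiled query.
        using (var dc = new MyContext("Database.sdf"))
            Console.WriteLine("{0} = {1}", id, Query(dc, id).Count());

        using (var dc = new MyContext("Database.sdf"))
            Console.WriteLine("{0} = {1}", id2, Query(dc, id2).Count());
    }
}
```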

You May Pay Even If You Do Everything Right (CryptoLocker)

Written by William Roush on January 13, 2014 at 7:14 pm

Many people in the IT field are depending on various products to protect them from CryptoLocker and similar malware, but how realistic is that really?

Seth Hall over at wrote an article detailing how many parts of your system must fail in order for CryptoLocker to infect your network. The major problem I have with the article is that this level of trust in your systems to protect you is exactly how a lot of companies got bit by the CryptoLocker ransomware, and the concept that "if you have these bases covered, you’re ok".

You’ll need an email server willing to send you infected executable attachments.

This assumes that CryptoLocker is going to come in a form that your email server will catch. One of the easiest ways to prevent your email server from blocking a piece of malware attached to an email is to password protect it, which CryptoLocker has been known to do [1] [2] [3]. This leaves a handful of options for detecting the email: either have a signature for the encrypted zip file, which wouldn’t work if unique passwords are being used per email, or attempt to decrypt all zips by searching the body of the email for the password (which I don’t think any mail filtering services do).

And that is all dependent on the idea that you’re being infected by an already detected derivative of CryptoLocker.

Your perimeter security solution will have to totally fail to spot the incoming threat.

Here Seth is talking about Firewall based anti-malware scanning. Again this falls into all of the same problems as relying on your email server to protect you.

Your desktop security solution will have to totally fail.

This is one of the major ones everyone relies on: your desktop antivirus catching malware. By far, this is what bit almost everyone infected by CryptoLocker. In my previous post about CryptoLocker I talk about how it wasn’t until 2013-11-11 that antiviruses were preventing CryptoLocker. With PowerLocker on the horizon, these assumptions are dangerous.

Your user education program will have to be proven completely ineffective.

Now this is one of the major important parts of security, and by far one of the largest things that irk me in IT. I’ll go into this more in a more business-oriented post, but it comes down to this: what happens when I allow someone into the building that doesn’t have an access card? Human Resources would have my head and I could very well lose my job (and rightfully so!). Why is it that IT’s policies get such lackluster enforcement at most places?

In general, IT policies and training are always fairly weak. Users often forget (in my opinion: because there is no risk to not committing it to memory), and training initiatives are rarely taken seriously. People who "don’t get computers" are often put into positions where they’ll be on one for 8 hours a day (I’m not talking IT level proficiency, I’m talking "don’t open that attachment").

I feel this is mostly due to the infancy of IT in the workplace at many places, and will change as damages continue to climb.

Your perimeter security solution will have to totally fail, a second time.

It really depends on how you have your perimeter security set up. Some companies are blocking large swaths of the internet in an attempt to reduce the noise you get from various countries which they do not do business with and from which they only receive attempts to break into their systems. This is pretty much the only circumstance in which your perimeter security will stop this problem.

Your intrusion prevention system […] will have to somehow miss the virus loudly and constantly calling out to Russia or China or wherever the bad guys are.

This is by far a dangerous assumption. CryptoLocker only communicates with a command and control server for a public key to encrypt your files with. I’d be thoroughly impressed by a system that’ll catch a few kilobytes of encrypted data being requested from a foreign server and not constantly trigger false alerts from normal use of the internet.

Your backup solution will have to totally fail.

This is by far in my opinion the only realistic "this is 100% your responsibility with a nearly 100% chance of success" on this list. Backups that have multiple copies, stored cold and off-site have nearly no chance of being damaged, lost or tampered with. Tested backups have nearly no chance of failing. Malware can’t touch what it can’t physically access, and this will always be your ace in the hole.

In Conclusion

And don’t take this post wrong! The list that Seth gives is a great list of security infrastructure, procedures and policies that should be in place. However, I think it reads as if you won’t get infected as long as you follow his list, and that is not entirely accurate.