LINQ uses a deferred execution model, which means that nothing really happens until the results of the query are accessed, e.g. in a foreach loop. One of the advantages of this model is that you can compose complex queries in multiple steps to make them more readable. So from an execution point of view I expected that it would not matter whether you create the query in one complex statement or in multiple smaller statements. But unfortunately that’s not always the case…
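To make the deferred part concrete, here is a minimal sketch. It assumes a plain in-memory array and a hypothetical `evaluations` counter, neither of which is part of the album example; they are only there to show when the filter actually runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns how often the filter had run before and after the query was enumerated.
    public static (int beforeEnumeration, int afterEnumeration) Run()
    {
        int evaluations = 0; // hypothetical counter, for illustration only
        var numbers = new[] { 1, 2, 3, 4, 5 };

        // Defining the query only describes the work; the lambda is not called here.
        IEnumerable<int> query = numbers.Where(n =>
        {
            evaluations++;
            return n % 2 == 0;
        });

        int before = evaluations; // the filter has not run yet

        // Enumerating the query is what drives the filter, once per element.
        foreach (int n in query) { }

        return (before, evaluations);
    }
}
```

Calling `DeferredExecutionDemo.Run()` should give `(0, 5)`: no evaluations at the point where the query is defined, and one per element once the loop has walked the sequence.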
To see the difference I’ve created an XML document and two simple queries that create Album objects when the where-clause matches.
The ‘complex’ query:
var query1 = from a in albums.Descendants("Album")
             where a.Element("Artist").Value == "Radiohead"
             select new Album
             {
                 Artist = a.Element("Artist").Value,
                 Title = a.Element("Title").Value
             };
The decomposed, more readable query:
var query2 = from a in albums.Descendants("Album")
             select new Album
             {
                 Artist = a.Element("Artist").Value,
                 Title = a.Element("Title").Value
             };

query2 = from a in query2
         where a.Artist == "Radiohead"
         select a;
The result of both queries is exactly the same, but they execute differently. In Query2 an Album object is created for every Album element in the document, and only then is the ‘where’ part evaluated on that object. No intermediate list is built, because the elements stream through the query one at a time, but every object is still constructed, including the ones that are thrown away. In Query1 an Album object is only created when the element matches the ‘where’ part. So in this example Query1 is more efficient.
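One way to see the difference is to count how many Album objects each query constructs. The sketch below is an assumption-laden reconstruction: the XML content, the static `Created` counter, and the `AlbumQueryDemo` helper are all hypothetical, since the original document and Album class are not shown.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public class Album
{
    // Hypothetical counter, added only to make object creation visible.
    public static int Created;

    public Album() { Created++; }

    public string Artist { get; set; }
    public string Title { get; set; }
}

public static class AlbumQueryDemo
{
    // Hypothetical sample data standing in for the original XML document.
    private const string Xml = @"<Albums>
  <Album><Artist>Radiohead</Artist><Title>OK Computer</Title></Album>
  <Album><Artist>Portishead</Artist><Title>Dummy</Title></Album>
  <Album><Artist>Massive Attack</Artist><Title>Mezzanine</Title></Album>
</Albums>";

    // Query1: filter on the XML elements first, then create Album objects.
    public static int CountCreatedFilterFirst()
    {
        XDocument albums = XDocument.Parse(Xml);
        Album.Created = 0;

        var query1 = from a in albums.Descendants("Album")
                     where a.Element("Artist").Value == "Radiohead"
                     select new Album
                     {
                         Artist = a.Element("Artist").Value,
                         Title = a.Element("Title").Value
                     };

        query1.ToList(); // force enumeration
        return Album.Created;
    }

    // Query2: create Album objects first, then filter on the objects.
    public static int CountCreatedProjectFirst()
    {
        XDocument albums = XDocument.Parse(Xml);
        Album.Created = 0;

        var query2 = from a in albums.Descendants("Album")
                     select new Album
                     {
                         Artist = a.Element("Artist").Value,
                         Title = a.Element("Title").Value
                     };

        query2 = from a in query2
                 where a.Artist == "Radiohead"
                 select a;

        query2.ToList(); // force enumeration
        return Album.Created;
    }
}
```

With this sample data, `CountCreatedFilterFirst()` should return 1 and `CountCreatedProjectFirst()` should return 3: both queries yield the same single Radiohead album, but the decomposed version constructs an object for every element before filtering.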
Is this what we should expect of deferred execution with LINQ? Yes, at least on implementations based on IEnumerable<T>.
With implementations based on IQueryable
In this example it isn’t much of a problem, but when more data is involved this is definitely something to be aware of when composing LINQ queries.