Wednesday, April 25, 2012

- LINQ to Objects

LINQ:
A strongly typed query language embedded directly into the grammar of C# itself. LINQ can be applied to any number of data stores, and it is not tied to a literal relational database. In fact, each C# LINQ query operator is shorthand for a call to an underlying extension method, typically defined by the System.Linq.Enumerable class. The LINQ API is an attempt to provide a consistent, symmetrical manner in which programmers can obtain and manipulate "data" in the broad sense of the term.
If you encounter an error similar to the following, make sure you have included the using System.Linq directive:
Error 1 Could not find an implementation of the query pattern for source type 'int[]'. 'Where' not found. Are you missing a reference to 'System.Core.dll' or a using directive for 'System.Linq'
Let's start with a simple example:

//string array of cars
string[] cars = { "BMW", "Lamborghini", "Bugatti", "Ferrari", "Lexus", "Kancil" };

// find cars that contain 'u' and sort them
var subset = from car in cars
             where car.Contains("u")
             orderby car
             select car;
foreach(var car in subset)
    Console.WriteLine ("car = {0}", car);
[Image: LINQ example output]
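The query expression above is shorthand for calls on Enumerable extension methods. The following is a minimal sketch of that equivalence; the class and method names (QueryVersusMethodSyntax, WithQuerySyntax, WithMethodSyntax) are made up for illustration and are not part of the original example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class: names are illustrative only.
public static class QueryVersusMethodSyntax
{
    // Query-expression form, as in the example above.
    public static IEnumerable<string> WithQuerySyntax(string[] cars)
    {
        return from car in cars
               where car.Contains("u")
               orderby car
               select car;
    }

    // The same query written as direct calls on Enumerable extension methods.
    public static IEnumerable<string> WithMethodSyntax(string[] cars)
    {
        return cars.Where(car => car.Contains("u"))
                   .OrderBy(car => car);
    }

    public static void Main()
    {
        string[] cars = { "BMW", "Lamborghini", "Bugatti", "Ferrari", "Lexus", "Kancil" };

        foreach (var car in WithMethodSyntax(cars))
            Console.WriteLine("car = {0}", car);   // car = Bugatti, then car = Lexus

        // Both forms yield the same sequence.
        Console.WriteLine(WithQuerySyntax(cars).SequenceEqual(WithMethodSyntax(cars)));  // True
    }
}
```

Which form to use is mostly a matter of readability; both compile to the same method calls.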
The underlying type of a LINQ query result is not obvious, since it depends on the operators used. That is why it is recommended to use implicit typing (the var keyword) to capture the result of a LINQ query.

Deferred Execution:
LINQ query expressions are not evaluated until you iterate over the resulting sequence. This lets you apply the same LINQ query multiple times to the same container and rest assured that you are obtaining the latest results. Here is an example:

int[] numbers = { 1, 5, 26, 2, 3, 4 };
//find numbers less than 10
var subset = from num in numbers
             where num < 10
             orderby num
             select num;
foreach (var i in subset)
    Console.Write("{0} ", i);
Console.WriteLine();
numbers[0] = 55;
Console.WriteLine("after modification :");
foreach (var i in subset)
    Console.Write("{0} ", i);
[Image: Deferred Execution example output]
During debugging, you can expand the query variable in the debugger to evaluate the query and see its result, as in the screenshot below:
[Image: debugging screenshot for a LINQ var]
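One way to observe deferred execution directly is to count how often the filter runs. This is a minimal sketch; the DeferredExecutionDemo class, the IsSmall helper, and the PredicateCalls counter are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class: names are illustrative only.
public static class DeferredExecutionDemo
{
    // Counts how many times the filter has been invoked.
    public static int PredicateCalls = 0;

    public static bool IsSmall(int n)
    {
        PredicateCalls++;
        return n < 10;
    }

    public static void Main()
    {
        int[] numbers = { 1, 5, 26, 2, 3, 4 };

        // Defining the query does not run the filter.
        IEnumerable<int> subset = numbers.Where(n => IsSmall(n));
        Console.WriteLine("calls after definition: {0}", PredicateCalls);   // 0

        // Each enumeration runs the filter over all six elements again.
        int firstCount = subset.Count();
        Console.WriteLine("calls after first pass: {0}", PredicateCalls);   // 6

        numbers[0] = 55;
        int secondCount = subset.Count();
        Console.WriteLine("calls after second pass: {0}", PredicateCalls);  // 12

        Console.WriteLine("{0} then {1}", firstCount, secondCount);         // 5 then 4
    }
}
```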
Immediate Execution:
The Enumerable type defines several extension methods, such as ToArray<T>(), ToDictionary<TSource, TKey>(), and ToList<T>(), that cause a LINQ query to execute at the exact moment you call them. Consider the following example:

int[] intSubset = (from num in numbers
                    where num < 10
                    orderby num
                    select num).ToArray<int>();
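To make the contrast with deferred execution concrete, the sketch below keeps both a deferred query and a ToList() snapshot of it, then modifies the source array. The ImmediateExecutionDemo class and the SmallSorted helper are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class: names are illustrative only.
public static class ImmediateExecutionDemo
{
    // Deferred query: re-evaluated on every enumeration.
    public static IEnumerable<int> SmallSorted(int[] numbers)
    {
        return from num in numbers
               where num < 10
               orderby num
               select num;
    }

    public static void Main()
    {
        int[] numbers = { 1, 5, 26, 2, 3, 4 };

        IEnumerable<int> deferred = SmallSorted(numbers);

        // Immediate: ToList() runs the query now and stores the results.
        List<int> snapshot = deferred.ToList();

        numbers[0] = 55;

        Console.WriteLine(string.Join(" ", deferred));  // 2 3 4 5
        Console.WriteLine(string.Join(" ", snapshot));  // 1 2 3 4 5
    }
}
```

The snapshot is a separate collection, so later changes to the source array do not affect it.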

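Since the var keyword cannot be used for fields (nor for parameters or return values), you cannot use implicit typing when you define a field within a class or struct; declare the field as IEnumerable<T> instead. In addition, when the query is assigned in a field initializer, its target cannot be instance-level data, because a field initializer cannot reference other instance members. The target must therefore be static, as in the example below: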

private static string[] colors = { "Blue", "Red", "Green" };
private IEnumerable<string> subset = from color in colors
                                        where color.Contains("r")
                                        orderby color
                                        select color;
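If the data has to be instance-level, one option is to build the query in a constructor rather than in a field initializer. The sketch below shows both variants side by side; the ColorQueries class and its member names are hypothetical and chosen only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class: names are illustrative only.
public class ColorQueries
{
    // Static source: a field initializer may refer to static members.
    private static readonly string[] staticColors = { "Blue", "Red", "Green" };

    // var is not allowed for fields, so the field is typed as IEnumerable<string>.
    private readonly IEnumerable<string> fromStatic =
        from color in staticColors
        where color.Contains("r")
        orderby color
        select color;

    // Instance source: the query is built in the constructor instead,
    // because a field initializer cannot refer to another instance field.
    private readonly string[] instanceColors;
    private readonly IEnumerable<string> fromInstance;

    public ColorQueries(string[] colors)
    {
        instanceColors = colors;
        fromInstance = from color in instanceColors
                       where color.Contains("r")
                       orderby color
                       select color;
    }

    public IEnumerable<string> FromStatic { get { return fromStatic; } }
    public IEnumerable<string> FromInstance { get { return fromInstance; } }

    public static void Main()
    {
        var queries = new ColorQueries(new[] { "Orange", "Pink", "Purple" });
        Console.WriteLine(string.Join(", ", queries.FromStatic));    // Green
        Console.WriteLine(string.Join(", ", queries.FromInstance));  // Orange, Purple
    }
}
```

Note that Contains("r") is case-sensitive, so "Red" does not match.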
LINQ to Objects can be used on any type implementing IEnumerable<T>. It is also possible to query data contained within non-generic collections by using the generic Enumerable.OfType<T>() extension method, which is one of the few Enumerable members that extend the non-generic IEnumerable interface rather than a generic type. When calling this member on a non-generic container implementing IEnumerable (e.g. ArrayList), specify the type of the items within the container to extract a compatible IEnumerable<T> object. As an example, let's create a Student class:


class Student
{
    public string Name { get; set; }
    public double CGPA { get; set; }
    public string Major { get; set; }

}

Now let's create an ArrayList of Student objects in Main:

//non-generic collection of students
ArrayList studentArrayList = new ArrayList(){
    new Student{Name = "Alex", CGPA =3.2, Major = "Software Engineering"},
    new Student{Name = "Kenneth", CGPA =2.5, Major = "Management"},
    new Student{Name = "Sam", CGPA =3.4, Major = "Software Engineering"},
    new Student{Name = "Robert", CGPA =3.9, Major = "Software Engineering"},
    new Student{Name = "James", CGPA = 3.5, Major = "Multimedia"},
    new Student{Name = "John", CGPA =2.0, Major = "Engineering"}
};

// Transform the ArrayList into an IEnumerable<T>-compatible type
var studentEnum = studentArrayList.OfType<Student>();

var goodStudents = from s in studentEnum where s.CGPA > 3.2 select s;

foreach(var student in goodStudents)
    Console.WriteLine("{0,-10} ->CGPA = {1}",student.Name, student.CGPA);
[Image: non-generic container example output]

When a non-generic container holds items of mixed types, you can use OfType<T>() to extract only the items of a specific type.
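As a minimal sketch of that filtering behaviour, the following pulls only the int values out of a mixed ArrayList. The OfTypeFilterDemo class and the OnlyInts helper are hypothetical names used for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class: names are illustrative only.
public static class OfTypeFilterDemo
{
    public static IEnumerable<int> OnlyInts(IEnumerable mixed)
    {
        // OfType<int>() skips every element that is not an int.
        return mixed.OfType<int>();
    }

    public static void Main()
    {
        ArrayList mixed = new ArrayList { 10, "ten", 400, false, 8, 3.5, "eight" };

        foreach (int n in OnlyInts(mixed))
            Console.Write("{0} ", n);   // 10 400 8
        Console.WriteLine();
    }
}
```

By contrast, Enumerable.Cast<T>() throws an InvalidCastException when it meets an element of the wrong type, so OfType<T>() is the safer choice for mixed containers.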
For more information, refer to the LINQ General Programming Guide on MSDN.
The System.Linq.Enumerable class also provides extension methods that:
  • Transform a result set in various manners: Reverse<>(), ToArray<>(), ToList<>(), ...
  • Extract singletons from a result set or perform various set operations: Distinct<>(), Union<>(), Intersect<>(), ...
  • Aggregate results: Count<>(), Sum<>(), Min<>(), Max<>(), ...
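Because LINQ query expressions are validated at compile time, the ordering of operators is critical: for example, a query expression must begin with a from clause and end with a select or group clause.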
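A few of the Enumerable methods listed above can be tried on plain arrays. This is only a sketch: the EnumerableHelpersDemo class and the Summarize helper are hypothetical names, and the sample data is borrowed from the earlier examples.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class: names are illustrative only.
public static class EnumerableHelpersDemo
{
    // Aggregates: each call enumerates the sequence and returns a single value.
    public static string Summarize(int[] values)
    {
        return string.Format("count={0} sum={1} min={2} max={3}",
            values.Count(), values.Sum(), values.Min(), values.Max());
    }

    public static void Main()
    {
        int[] numbers = { 1, 5, 26, 2, 3, 4 };
        string[] majors = { "Software Engineering", "Management", "Software Engineering" };

        // Transform: invoked here as a plain static method, which is what an
        // extension-method call compiles to anyway.
        IEnumerable<int> reversed = Enumerable.Reverse(numbers);
        Console.WriteLine(string.Join(" ", reversed));      // 4 3 2 26 5 1

        // Set-style operation: duplicates are dropped.
        List<string> distinctMajors = majors.Distinct().ToList();
        Console.WriteLine(distinctMajors.Count);            // 2

        Console.WriteLine(Summarize(numbers));              // count=6 sum=41 min=1 max=26
    }
}
```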

Projecting New data type:
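You can project into an anonymous type after select. However, an anonymous type has no name that could be used as a method's return type, so if you need to return the result to the caller you must first convert it to a container using a member of the Enumerable type, such as ToArray(), as in the example below: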
....
var goodStudents = from s in studentEnum where s.CGPA > 3.2 select new {s.Name, s.Major};
return goodStudents.ToArray(); 
....
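The fragment above omits the enclosing method. One way to complete it, shown here as a sketch under assumptions, is to return the weakly typed System.Array. The ProjectionDemo class and the GetGoodStudentSummaries method are hypothetical names, and Student is repeated only so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public string Name { get; set; }
    public double CGPA { get; set; }
    public string Major { get; set; }
}

// Hypothetical demo class: names are illustrative only.
public static class ProjectionDemo
{
    // The anonymous type has no name, so the method returns System.Array.
    public static Array GetGoodStudentSummaries(IEnumerable<Student> students)
    {
        var goodStudents = from s in students
                           where s.CGPA > 3.2
                           select new { s.Name, s.Major };
        return goodStudents.ToArray();
    }

    public static void Main()
    {
        var students = new List<Student>
        {
            new Student { Name = "Sam", CGPA = 3.4, Major = "Software Engineering" },
            new Student { Name = "Kenneth", CGPA = 2.5, Major = "Management" },
            new Student { Name = "James", CGPA = 3.5, Major = "Multimedia" }
        };

        // The caller only sees object; ToString() on an anonymous type
        // lists its property names and values.
        foreach (object item in GetGoodStudentSummaries(students))
            Console.WriteLine(item);
        // { Name = Sam, Major = Software Engineering }
        // { Name = James, Major = Multimedia }
    }
}
```

The caller loses compile-time access to Name and Major this way, so a small named class is often the better choice when the result has to cross a method boundary.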
Let's say you want to count the good students in the Student example (where goodStudents is an IEnumerable<Student>):

int numOfStudents = goodStudents.Count<Student>();

Except() example:

//students with CGPA > 3.2
var goodStudents = from s in studentEnum where s.CGPA > 3.2 select s;
//software engineering students
var softwareEngineers = from s in studentEnum where s.Major == "Software Engineering" select s;

//student with CGPA > 3.2 and not software engineer
var selectedStudents = (from s1 in goodStudents select s1)
    .Except(from s2 in softwareEngineers select s2);
           
foreach (var student in selectedStudents )
    Console.WriteLine("{0,-8} ->CGPA = {1,-8} Major = {2,-8}"
        ,student.Name, student.CGPA, student.Major);
[Image: result of the Except() example]
Intersect() example:
//student with CGPA > 3.2 and software engineer
var selectedStudents = (from s1 in goodStudents select s1)
    .Intersect(from s2 in softwareEngineers select s2);
[Image: Intersect() example output]
Union() example (the result contains no repeated values):

//students with CGPA > 3.2 or software engineer
var selectedStudents = (from s1 in goodStudents select s1)
    .Union(from s2 in softwareEngineers select s2);
[Image: Union() example output]
Concat() works like Union() but keeps repeated values. If you then wish to remove the duplicates, you can use Distinct().
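The difference between the three is easiest to see on plain strings. In this minimal sketch the ConcatVersusUnionDemo class is a hypothetical name, and the two arrays simply reuse student names from the examples above.

```csharp
using System;
using System.Linq;

// Hypothetical demo class: names are illustrative only.
public static class ConcatVersusUnionDemo
{
    public static void Main()
    {
        string[] goodStudents = { "Sam", "Robert", "James" };
        string[] softwareEngineers = { "Alex", "Sam", "Robert" };

        // Concat() appends the second sequence as-is, duplicates included.
        var concatenated = goodStudents.Concat(softwareEngineers);
        Console.WriteLine(string.Join(", ", concatenated));
        // Sam, Robert, James, Alex, Sam, Robert

        // Union() yields each distinct value once.
        var union = goodStudents.Union(softwareEngineers);
        Console.WriteLine(string.Join(", ", union));
        // Sam, Robert, James, Alex

        // Concat() followed by Distinct() leaves the same number of items as Union().
        Console.WriteLine(concatenated.Distinct().Count());  // 4
    }
}
```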


Reference: Pro C# 2010 and the .NET 4 Platform by Andrew Troelsen.
