Join Operators

Joining operations associate objects of one data source with another through a common attribute.

These operations prove critical in queries targeting data sources lacking clear, direct relationships; for example, if a customer class has a City property and the City class lacks a Customer object collection property, joins allow for finding all the customers within each city.

LINQ provides two join operator methods: Join and GroupJoin. These methods perform joins as a result of key equality, i.e., equijoins.

The Join method utilizes key selector functions to join two sequences and extract value pairs. It implements an inner join meaning it only returns objects with a match in the other data set. It utilizes the following syntax:

public static IEnumerable<TResult> Join<TOuter, TInner, TKey, TResult>(
	this IEnumerable<TOuter> outer,
	IEnumerable<TInner> inner,
	Func<TOuter, TKey> outerKeySelector,
	Func<TInner, TKey> innerKeySelector,
	Func<TOuter, TInner, TResult> resultSelector
)

Review an example of its use below:

class Location
{
	public string Name { get; set; }
}
class Equipment
{
	public string Name { get; set; }
	public Location Inventory { get; set; }
}
public static void JoinOne()
{
	Location Pune = new Location { Name = "Pune" };
	Location Gzhou = new Location { Name = "Guangzhou" };
	Location Jak = new Location { Name = "Jakarta" };
				
	Equipment LCfive = new Equipment { Name = "LaserCut 5000", Inventory = Pune };
	Equipment Rolltwo = new Equipment { Name = "Roller 2000", Inventory = Pune };
	Equipment SPfour = new Equipment { Name = "SteamPress 400", Inventory = Jak };
				
	List<Location> locations = new List<Location> { Pune, Gzhou, Jak };
	List<Equipment> equips = new List<Equipment> { LCfive, Rolltwo, SPfour };
				
	// Make a list of Location-Equipment pairs with
	// each element as an anonymous type containing
	// equipment name and the location name.
	var query =
		locations.Join(equips,
			location => location,
			equipment => equipment.Inventory,
			(location, equipment) =>
				new { LocationName = location.Name, Equipment = equipment.Name });

	foreach (var obj in query)
	{
		Console.WriteLine(
			"{0} - {1}",
			obj.LocationName,
			obj.Equipment);
	}
}

The GroupJoin method utilizes key selector functions to join two sequences, and for each element, it groups resulting matches. It implements a superset of both inner joins and left outer joins. Left joins return all elements of the first (left) source regardless of whether matches exist in the other source. It utilizes the following syntax:

public static IEnumerable<TResult> GroupJoin<TOuter, TInner, TKey, TResult>(
	this IEnumerable<TOuter> outer,
	IEnumerable<TInner> inner,
	Func<TOuter, TKey> outerKeySelector,
	Func<TInner, TKey> innerKeySelector,
	Func<TOuter, IEnumerable<TInner>, TResult> resultSelector
)

Review an example of a groupjoin below:

class Location
{
	public string Name { get; set; }
}
class Equipment
{
	public string Name { get; set; }
	public Location Inventory { get; set; }
}
public static void JoinOne()
{
	Location Pune = new Location { Name = "Pune" };
	Location Gzhou = new Location { Name = "Guangzhou" };
	Location Jak = new Location { Name = "Jakarta" };
				
	Equipment LCfive = new Equipment { Name = "LaserCut 5000", Inventory = Pune };
	Equipment Rolltwo = new Equipment { Name = "Roller 2000", Inventory = Pune };
	Equipment SPfour = new Equipment { Name = "SteamPress 400", Inventory = Jak };
				
	List<Location> locations = new List<Location> { Pune, Gzhou, Jak };
	List<Equipment> equips = new List<Equipment> { LCfive, Rolltwo, SPfour };
				
	// Make a list with each element as an anonymous
	// type containing a Location name and
	// a collection of owned equipment names.
	var query =
		locations.GroupJoin(pets,
			location => location,
			equipment => equipment.Inventory,
			(location, equipmentCollection) =>
				new
				{
					LocationName = location.Name,
					Equips = equipmentCollection.Select(equipment => equipment.Name)
				});
				
	foreach (var obj in query)
	{
		// Output the owner.
		Console.WriteLine("{0}:", obj.LocationName);
		// Output equipment with owners.
		foreach (string equipment in obj.Equips)
		{
			Console.WriteLine(" {0}", equipment);
		}
	}
}

Use the DefaultIfEmpty method in combination with GroupJoin to achieve a left outer join. Either null or a user-defined type can be used as the default values of reference types. Review an example of a left outer join below:

var LeftOutrQry =
	from genre in genres
	join recording in recordings on genre.ID equals record.GenreID into recordGroup
	from item in recordGroup.DefaultIfEmpty(new Recording { Name = String.Empty, GenreID = 0 })
	select new { GenreName = genre.Name, RecordName = record.Name };

EQUALITY

Join clauses perform equijoins, utilizing the equals keyword rather than the == operator, meaning matches only stem from key equality. Other comparison types, e.g., “greater than,” require custom solutions. The equals keyword only appears in a join clause and in its use, unlike the == operator, the left key consumes the sequence of the outer source while the right key consumes the inner. The outer source only exists in scope on the left side and the inner only exists on the right.

CUSTOM JOINS

LINQ lacks non-equijoins, cross joins, and other options; however, the insertion of multiple from clauses achieves the same function. This proves critical in joins dependent on more than a single equality or inequality expression, or when using temporary range variables for the inner sequence prior to joining.

Use multiple from clauses to insert data sources independently. Then apply a predicate expression in a where clause to each source's range variable. Never confuse this manipulation with the use of multiple from clauses for inner collection access.

The example below achieves a cross join:

class CustomOp
{
	#region Data

	class Recording
	{
		public string Name { get; set; }
		public int GenreID { get; set; }
	}

	class Genre
	{
		public string Name { get; set; }
		public int ID { get; set; }
	}

	// 1st data source.
	List<Genre> genres = new List<Genre>()
	{
		new Genre(){Name="Electro", ID=001},
		new Genre(){ Name="Jazz", ID=002},
		new Genre(){ Name="Traditional", ID=003},
	};

	// 2nd data source
	List<Recording> records = new List<Recording>()
	{
		new Product{Name="Prince", GenreID=002},
		new Product{Name="1999", GenreID=001},
		new Product{Name="Dirty Mind", GenreID=001},
		new Product{Name="For You", GenreID=003},
		new Product{Name="Controversy", GenreID=001},
	};

	#endregion

	static void Main()
	{
		CustomJoins app = new CustomJoins();
		app.CrossJoin();
		app.NonEquijoin();

		Console.WriteLine("Press a key to continue.");
		Console.ReadKey();
	}
	void CrossJoin()
	{
		var crossJoinQuery =
			from g in genres
			from r in records
			select new { g.ID, r.Name };
		Console.WriteLine("Cross Join:");
		foreach (var x in crossJoinQuery)
		{
			Console.WriteLine("{0,-5}{1}", x.ID, x.Name);
		}
	}

	void NonEquijoin()
	{
		var nonEquijoinQuery =
			from r in records
			let gnrIds = from g in genres
						 select g.ID
			where gnrIds.Contains(r.GenreID) == true
			select new { Recording = r.Name, GenreID = r.GenreID };
		Console.WriteLine("Non-equijoin query:");
		foreach (var n in nonEquijoinQuery)
		{
			Console.WriteLine("{0,-5}{1}", n.GenreID, n.Recording);
		}
	}
}