Every one of us has used the sorting in our applications. Rarely the sorting is implemented by ourselves. In most of the cases the IComparer<T> interface and TComparison<T> functions are used.
To sort a collection the following syntax is used:
FCollection.Sort(TComparer<TMyType>.Construct( function (const L, R: TMyType): Integer begin Result := L.Value – R.Value; end));
If you are using the Spring4d framework you can omit the comparer construction and just throw the comparison function directly. Nothing to improve, right?
Multi-field sorting
When it comes to multi-field sorting (ordering) it is also simple and usually done like following:
FCollection.Sort(TComparer<TMyType>.Construct( function (const L, R: TMyType): Integer begin Result := L.Value1 – R.Value1; if Result = 0 then Result := L.Value2 – R.Value2; end));
What I do not like from readability point of view is the following things:
- You write code to get value for the left/right operand twice (once for left, once for right);
- You need to care about “comparer – specific” business with integer comparison result;
- For nested sorting, you need to add the nested IF statements.
Ordering like a pro
Lets sort the list of record by 3 fields now (by ParentID field descending, then by Name case insensitive and then by ID).
type TRec = record ID, ParentID: Integer; Name: String; end; var LList: IList<TRec>;
So instead of the following code:
LList.Sort( function (const L, R: TRec): Integer begin Result := R.ParentID - L.ParentID; if Result = 0 then begin Result := CompareText(L.Name, R.Name); if Result = 0 then Result := L.ID - R.ID; end; end);
we will have the following code:
LList.Sort(TComparerEx<TRec>.Create .OrderByDesc( function (const A: TRec): Integer begin Result := A.ParentID; end) .OrderBy( function (const A: TRec): String begin Result := A.Name; end) .OrderBy( function (const A: TRec): Integer begin Result := A.ID; end));
And the TComparerEx<T> class implements the IComparerEx<T> interface which descends from the standard IComparer<T>. Each call to the OrderBy constructs a copy of the comparer so that the source instance is not affected in case if you are passing the comparer to the outer world and want to append comparisons there. You can also use Ascending and Descending methods to get a new instance of the comparer with changed direction of the last comparison.
When you use the extended comparer, you get following advantages:
- Better readable code as there is no comparer-specific code (in most cases);
- More transparent for the complex types;
- You write the value retrieve/lookup only once and not twice;
- Unit tests are much easier for the standard data types as there is no branching and you do not need to test the TComparerEx<T> itself every time.
What you lose:
- Some kind of compactness in a line count (subjectively negative);
- Performance drop – not noticeable unless you sort multiple hundreds of thousands of records by 3 fields (for 600’000 records of TRec it is 1050ms vs 440ms when using standard comparison with nested IFs).
The source code for the TComparerEx is available on GitHub. You can use it on your own risk, modify it as you want – I do not give you any guarantees on that.
Delphi still lacks of support for lambda expressions. If there will be lambda support then we will be able to make sorting similar to C# or Java languages in the very nice way like:
FList.OrderByDesc(x => x.ParentID).ThenBy(x => x.Name).ThenBy(x => x.ID);
The performance drop comes from the string comparing part. While in the “all in one” delegate the compiler can generate quite efficient code
CompareText(left.Name, right.Name);
mov edx,[esi+$08]
mov eax,[ebx+$08]
call CompareText
it generates terribly inefficient code for the OrderByCustom generated delegate. I already optimized some code to determine case sensitivity and ordering direction when creating the comparer and it still looks like this:
Comparers.pas.480: begin
004E4358 55 push ebp
004E4359 8BEC mov ebp,esp
004E435B 6A00 push $00
004E435D 6A00 push $00
004E435F 6A00 push $00
004E4361 53 push ebx
004E4362 56 push esi
004E4363 894DFC mov [ebp-$04],ecx
004E4366 8BF2 mov esi,edx
004E4368 8BD8 mov ebx,eax
004E436A 33C0 xor eax,eax
004E436C 55 push ebp
004E436D 68B5434E00 push $004e43b5
004E4372 64FF30 push dword ptr fs:[eax]
004E4375 648920 mov fs:[eax],esp
Comparers.pas.481: Result := -CompareText(AFunc(L), AFunc(R));
004E4378 8D55F8 lea edx,[ebp-$08]
004E437B 8B45FC mov eax,[ebp-$04]
004E437E FF530C call dword ptr [ebx+$0c]
004E4381 8B45F8 mov eax,[ebp-$08]
004E4384 50 push eax
004E4385 8D55F4 lea edx,[ebp-$0c]
004E4388 8BC6 mov eax,esi
004E438A FF530C call dword ptr [ebx+$0c]
004E438D 8B45F4 mov eax,[ebp-$0c]
004E4390 5A pop edx
004E4391 E83AB0F3FF call CompareText
004E4396 F7D8 neg eax
004E4398 8BD8 mov ebx,eax
Comparers.pas.482: end);
004E439A 33C0 xor eax,eax
004E439C 5A pop edx
004E439D 59 pop ecx
004E439E 59 pop ecx
004E439F 648910 mov fs:[eax],edx
004E43A2 68BC434E00 push $004e43bc
004E43A7 8D45F4 lea eax,[ebp-$0c]
004E43AA BA02000000 mov edx,$00000002
004E43AF E88866F2FF call @UStrArrayClr
004E43B4 C3 ret
004E43B5 E93E5CF2FF jmp @HandleFinally
004E43BA EBEB jmp $004e43a7
004E43BC 8BC3 mov eax,ebx
004E43BE 5E pop esi
004E43BF 5B pop ebx
004E43C0 8BE5 mov esp,ebp
004E43C2 5D pop ebp
004E43C3 C3 ret
That’s because of the implicit variable the compiler generates because of the function call.
While this might look bad at first you would get a similar performance result if you put Name as a property with a getter function.
Bottom line: yes, if you compare fields of a record or directly field backed properties you might get a performance drop – but that only results in the indirection created by the anonymous “getter” functions.
In my test with 600k records and Name being a property backed by a function getter I got almost equal results of both ways (including my aforementioned optimizations).
LikeLiked by 1 person
First of all, i like the idea. If Delphi would support lambda expressions you would get rid of the line overhead (permanently declaring function bla bla bla). One thing i have to tell is about the following code snippet you posted:
LList.Sort(
function (const L, R: TRec): Integer
begin
Result := R.ParentID – L.ParentID;
if Result = 0 then
begin
Result := CompareText(L.Name, R.Name);
if Result = 0 then
Result := L.ID – R.ID;
end;
end);
I am not sure but i think you can just skip the nested if like:
LList.Sort(
function (const L, R: TRec): Integer
begin
Result := R.ParentID – L.ParentID;
if Result = 0 then
Result := CompareText(L.Name, R.Name);
if Result = 0 then
Result := L.ID – R.ID;
end);
I mean its still not readable, but more readable than the code sample above.
LikeLiked by 2 people