A customer using on-prem Skype for Business 2015 (SFB) had a problem with Caller ID via their SIP trunking provider (the "provider"). What started as a simple "this should be easy" uncovered a mountain of challenges – enough to make a good blog article!
The problem was that although Caller ID was working fine when making calls from the SFB client to PSTN numbers via the provider, when it came to new Response Groups that redirect to PSTN numbers, the line was just hanging up.
What was going on?
The investigation started out at quite a high level and ended up with a combination of MSPL scripts and Wireshark! Let's start with the basics: the RGS and trunk configurations. (Data has been obfuscated.)
The first thought is: the response group doesn't have permissions to use the trunk (i.e. it doesn't have a voice policy). If you Google "add voice policy to response groups", you'll encounter a number of articles that explain that, by default, RGs do not have permission to use the Enterprise Voice features and therefore cannot redirect to the PSTN world. As per the blogs, a per-user Voice Policy was created and applied to the RG. It made no difference. (Interestingly, when the permission was removed, it still made no difference. At the time of writing, it is unclear whether the problem of RGs lacking Enterprise Voice permissions applies only to Lync and not to Skype for Business.)
So, that didn't work. Next thing to try: a debug trace. That showed a rather uninformative error:
TL_INFO(TF_PROTOCOL) 7F70.886C::02/15/2017-11:02:55.214.000281b0 (SIPStack,SIPAdminLog::ProtocolRecord::Flush:ProtocolRecord.cpp(261)) $$begin_record
Start-Line: SIP/2.0 503 Service Unavailable
CSEQ: 2 REFER
VIA: SIP/2.0/TLS 10.1.1.20:56974;branch=z9hG4bK0525001B.8CA107FA5D9520AF;branched=FALSE,SIP/2.0/TLS 10.1.1.20:56986;branch=z9hG4bK552c3a92;ms-received-port=56986;ms-received-cid=14C00
ms-diagnostics: 10014;source="sfbfe.contoso.com";reason="An internal exception received while processing the incoming request";component="MediationServer";Exception="Failed constructing and sending a message to the network. The failure may be due to network issues or invalid data from the application used to build the message."
ms-diagnostics-public: 10014;reason="An internal exception received while processing the incoming request";component="MediationServer";Exception="Failed constructing and sending a message to the network. The failure may be due to network issues or invalid data from the application used to build the message."
Don't you just hate errors that say "something went wrong but I'm not going to say what"? :-)
There is, however, something vaguely useful in there:
"The failure may be due to network issues…" OK, time to dig out Wireshark. That showed the problem – or at least the first problem:
Ah ha! So, it looks like the provider doesn't support the REFER verb. In fact, the trace shows that REFER isn't one of the supported verbs (in response to OPTIONS) and a quick phone call to the provider confirmed this. They suggested doing a re-INVITE instead. So, REFER was disabled on the gateway:
… and everything worked. Right? Well, yes and no. "Yes" because the calls being initiated by the RG were working, but "No" because the Caller ID was not working: the Caller ID actually being shown was the Conference number! Why was SFB sending the Conference number instead of the original caller ID?
We had a look through the configuration to make sure that it wasn't buried somewhere in there (other than where it should be, of course!), and there was nothing. So, finally, time to dig out our friend Wireshark again. (Or, if you really want something on steroids, use Microsoft Message Analyzer: it is insanely powerful!)
I won't list the trace of all the packets, but when a filter was put on (SIP.Message=="INVITE"), it showed something interesting: the conference number was nowhere to be seen! That could only mean one thing: it was coming from the provider itself. Rather than call them (again), we decided to do something heinous and unprecedented: we read the provider's manual! It turned out to be a very good (and technical) manual that went into much detail on how it handles SIP. But first, we took another look at the Wireshark traces – specifically the TO and FROM of the INVITE that is generated by the RG:
The FROM in this case is the number of the original caller. In other words, the RG presents the original caller as the calling party when establishing the divert. This does make sense, except for one problem, which is stated quite clearly in the provider's manual:
"Non-provider registered CLIs in the 'from' are contingent upon sending a valid network CLI in the PAID (P Asserted ID) header.... [otherwise] we will automatically present the default CLI, which is the first number in the Provider allocated DDI range."
Ah ha! In this instance, because the RG sends the original caller's ID in the FROM field, of course it isn't one of the provider's registered numbers. So, the provider substitutes the first number in its list, which just happens to be the conference number. It explains the behaviour perfectly. Just send a valid number in the PAI field and it's problem solved, right? Well, no, sadly not… We enabled PAI on the trunk as follows, expecting it to fix the problem:
It didn't. In fact, not only did it not fix the problem, it actually broke the Caller ID for normal "direct" calls, which started to say "Private Number". It felt like one step forwards, two steps backwards! However, the next step revealed two things that we weren't aware of in Skype for Business. Read on…
Time to dig out Wireshark for the third time to see what is going on when an RG diverts the call…
plus… the following which wasn't there before:
So, SFB is definitely sending a P-ASSERTED-IDENTITY (and a less welcome Privacy – more on that later). Notice that the P-ASSERTED-IDENTITY is the same as the FROM field. But the rules say "are contingent upon sending a valid network CLI in the PAID (P Asserted ID) header". Unfortunately, although we are sending a P-ASSERTED-IDENTITY, it's not a number that the provider will recognise, which is why the provider still uses the conferencing number as the Caller ID. How do you get Skype for Business to send a specific P-ASSERTED-IDENTITY? Without custom code or an SBC, you can't. :-( But that's never stopped us before, so soldiering on…
Why did it also break normal "direct" outbound calls – which now show as "Private Number"? Again, Wireshark to the rescue…
FROM: "Fred Bloggs"<sip:+firstname.lastname@example.org;user=phone>;epid=blah;tag=blah
P-ASSERTED-IDENTITY: "Fred Bloggs"<sip:email@example.com>,tel:+442071234567
(+442071234567 is a valid provider number.)
So, why is it now showing "Private Number"? Because that's what "Privacy: id" means. "The presence of this privacy type in a Privacy header field indicates that the user would like the Network Asserted Identity to be kept private with respect to SIP entities outside the Trust Domain with which the user authenticated." (RFC 3325) So, we can now see how the provider's SBC logic works: if a provider-owned number is in the P-ASSERTED-IDENTITY field, display the FROM number as the Caller ID, unless "Privacy: id" is specified, in which case make the Caller ID "private". Oh no!
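To make that decision logic easier to follow, here is a minimal Python sketch of how the provider's SBC appears to pick the Caller ID, pieced together from the manual quote and the behaviour seen in the traces. It is a guess at the rules, not the provider's actual implementation; the Invite class, the presented_cli() function and all of the phone numbers are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

PRIVATE = "Private Number"


@dataclass
class Invite:
    from_number: str                   # number carried in the FROM header
    pai_numbers: Sequence[str] = ()    # numbers in P-Asserted-Identity, if sent
    privacy: Optional[str] = None      # raw Privacy header value, if sent


def presented_cli(invite, provider_numbers, default_cli):
    """Best guess at the Caller ID the provider shows for an outbound INVITE."""
    privacy_values = [v.strip().lower() for v in (invite.privacy or "").split(";")]
    pai_is_valid = any(n in provider_numbers for n in invite.pai_numbers)

    if pai_is_valid:
        # A valid network CLI was asserted: FROM is honoured unless privacy is requested.
        return PRIVATE if "id" in privacy_values else invite.from_number
    if invite.from_number in provider_numbers:
        # No valid PAI, but FROM is itself a provider-registered number.
        return invite.from_number
    # Neither header carries a provider number: fall back to the default CLI.
    return default_cli


# Hypothetical numbers, for illustration only.
CONFERENCE = "+442071234500"           # assumed to be the first number in the DDI range
USER_DDI = "+442071234567"
EXTERNAL_CALLER = "+441632960123"
PROVIDER_NUMBERS = {CONFERENCE, USER_DDI}

direct_call = Invite(USER_DDI)                                   # PAI off on the trunk
rg_divert = Invite(EXTERNAL_CALLER)                              # PAI off on the trunk
rg_divert_pai = Invite(EXTERNAL_CALLER, [EXTERNAL_CALLER], "id") # PAI on: copies FROM
direct_call_pai = Invite(USER_DDI, [USER_DDI], "id")             # PAI on: adds Privacy
rg_divert_wanted = Invite(EXTERNAL_CALLER, [USER_DDI])           # what we actually need

for call in (direct_call, rg_divert, rg_divert_pai, direct_call_pai, rg_divert_wanted):
    print(presented_cli(call, PROVIDER_NUMBERS, CONFERENCE))
```

The last case is the one we are after: a provider-owned number in the P-ASSERTED-IDENTITY, no Privacy header, and the original caller left in the FROM.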
But how did the "Privacy: id" header get in there? We certainly didn't specify it. Well, we Googled this one and encountered this: https://social.technet.microsoft.com/Forums/lync/en-US/4e2750a9-6536-4aed-abe3-089f1064f5f6/passertedidentity-and-privacy-id?forum=lyncvoice. A Microsoft moderator confirmed that "it isn't possible to send P-Asserted-Identity without the Privacy ID header. It is designed by hard code." In other words, if you specify to send the "P-ASSERTED-IDENTITY", you get "Privacy: id" as well and there's no choice in the matter (unless you write some code or use an SBC). :-(
It's not as bad as it sounds, though, as we do have a workaround: disable P-ASSERTED-IDENTITY in the trunk configuration (as it breaks "normal" caller ID) and add it in manually using MSPL when the call is a divert from an RG. That should fix it, right? Well, unfortunately not! And this was the final thing that we learnt about the limitations of Skype for Business's "out of the box" functionality.
Finally, we created an MSPL script to do exactly the above: look for an INVITE to the provider's domain "184.108.40.206" from the RG and, if it finds one, add a valid provider number in the P-ASSERTED-IDENTITY without the Privacy field. It didn't work! We had successfully trapped the INVITE messages that we were looking for, but when we tried using the AddHeader() function, it kept raising an error. We finally ascertained that the error was telling us that the P-ASSERTED-IDENTITY header already existed.
What?? But the Wireshark trace clearly shows that it doesn't. So, we compared the Wireshark trace against a dump of all the headers that MSPL was seeing. Amazingly, they weren't the same:
When we looked at the header that MSPL was seeing, we saw this in the trace:
However, when looking at Wireshark, it wasn't there. This is why the attempt to insert the "P-Asserted-Identity" using MSPL was failing: it was already there. The workaround? The only one remaining was to use an SBC and re-architect the environment.
If you need to do anything beyond the very straightforward PSTN / VOIP management, even with Skype for Business on-prem, you'll still almost certainly need an SBC!
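If you want to repeat the comparison between what MSPL sees and what actually goes on the wire, a few lines of Python will do the diff for you. This is only a sketch: header_names() is a hypothetical helper, the two sample messages are invented stand-ins for the real dumps, and it does not try to handle compact header forms.

```python
from typing import Set


def header_names(raw_message: str) -> Set[str]:
    """Return the lower-cased header names found in a raw SIP message."""
    names = set()
    for line in raw_message.strip().splitlines():
        if not line.strip():
            break                      # blank line: headers are over, body follows
        if line[0] in " \t":
            continue                   # folded continuation of the previous header
        name, colon, _ = line.partition(":")
        name = name.strip()
        if colon and name and " " not in name:
            names.add(name.lower())    # the request line ("INVITE sip:...") is skipped
    return names


# Invented sample data standing in for the MSPL header dump and the Wireshark capture.
mspl_view = "\r\n".join([
    "INVITE sip:+441632960999@provider.example;user=phone SIP/2.0",
    "From: <sip:+441632960123@sfb.example;user=phone>;tag=abc",
    "To: <sip:+441632960999@provider.example;user=phone>",
    "P-Asserted-Identity: <sip:+441632960123@sfb.example;user=phone>",
])
wire_view = "\r\n".join([
    "INVITE sip:+441632960999@provider.example;user=phone SIP/2.0",
    "From: <sip:+441632960123@sfb.example;user=phone>;tag=abc",
    "To: <sip:+441632960999@provider.example;user=phone>",
])

only_in_mspl = header_names(mspl_view) - header_names(wire_view)
only_on_wire = header_names(wire_view) - header_names(mspl_view)
print("Seen by MSPL but not on the wire:", sorted(only_in_mspl))
print("On the wire but not seen by MSPL:", sorted(only_on_wire))
```

Any header that turns up in the first list is one that MSPL will refuse to add again, even though the provider never receives it.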