AP Joining Issue to WLC Running 8.0.100.0

In this post I will discuss an issue I faced today while joining an AP to a WLC running version 8.0.100.0.

Five days ago I got a new 2602 AP, and today I connected it to my switch in the correct AP VLAN. The AP got an IP address from the DHCP pool and the WLC IP via DHCP Option 43, and then started downloading its image from the WLC.

I was relieved that it was working, because it meant I could move on to testing important topics like Auto Anchor, Static IP Tunneling and Foreign Mapping.

After 1-2 minutes I saw a failure that I had never seen before. Here are the logs:

TestAP#
 *Nov 19 13:37:59.999: AP has SHA2 MIC certificate - Using SHA2 MIC certificate for DTLS.
 *Nov 19 13:38:00.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: 192.168.10.3 peer_port: 5246
 *Nov 19 13:38:29.999: DTLS_CLIENT_ERROR: ../capwap/base_capwap/dtls/base_capwap_dtls_connection_db.c:2214 Max retransmission count reached for Connection 0x8D69EB4!
 *Nov 19 13:38:59.999: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to 192.168.10.3:5246
 *Nov 19 13:38:59.999: AP has SHA2 MIC certificate - Using SHA2 MIC certificate for DTLS.
 *Nov 19 13:39:00.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: 192.168.10.1 peer_port: 5246Peer certificate verification failed FFFFFFFF
 *Nov 19 13:39:00.099: DTLS_CLIENT_ERROR: ../capwap/base_capwap/capwap/base_capwap_wtp_dtls.c:496 Certificate verified failed!
 *Nov 19 13:39:00.099: %DTLS-5-SEND_ALERT: Send FATAL : Bad certificate Alert to 192.168.10.1:5246
 *Nov 19 13:39:00.099: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to 192.168.10.1:5246
 TestAP#

After some googling I found this bug: CSCur43050, APs mfg in September/October 2014 unable to join an AireOS controller

Description

Symptom:
New Aironet APs with factory installed recovery IOS are able to join the controller 8.0.100.0 and download 15.3(3)JA IOS. But after the AP reload, the APs are unable to join the controller. On the AP, logs similar to the following are seen:

*Oct 16 12:39:06.231: AP has SHA2 MIC certificate - Using SHA2 MIC certificate for DTLS.
 *Oct 16 13:14:56.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: ***.***.***.*** peer_port: 5246Peer certificate verification failed FFFFFFFF
 *Oct 16 13:14:56.127: DTLS_CLIENT_ERROR: ../capwap/base_capwap/capwap/base_capwap_wtp_dtls.c:496 Certificate verified failed!
 *Oct 16 13:14:56.127: %DTLS-5-SEND_ALERT: Send FATAL : Bad certificate Alert to ***.***.***.***:5246
 *Oct 16 13:14:56.127: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to ***.***.***.***:5246

Another symptom of this problem is that the AP may be able to join the 8.0.100.0 controller, download the IOS code, boot up and join the controller OK … but when it goes to upgrade to newer 8.x code, it gets stuck in a loop failing the download.

Conditions:
Seen only with APs that were manufactured in September or October, 2014 – all Aironet APs were affected EXCEPT the 700 series. Seen with WLCs running 8.0.100.0 or an 8.0.100.x special.

If the WLC was manufactured in September 2014, or later (i.e. has a SHA2 MIC), then the first symptom is seen, i.e. the AP joins the 8.0.100 WLC, downloads the image, but then fails to rejoin.

If the WLC was manufactured before September 2014 (i.e. does not have a SHA2 MIC), then the second symptom is seen, i.e. the AP can join the 8.0.100 WLC OK, but then will fail download during a subsequent upgrade.

Also seen with new APs trying to join a controller running IOS-XE 3.6.0 (15.3(3)JN k9w8 image.) (Track CSCur50946 for the IOS-XE fix)

Workaround:
Downgrade to AireOS 7.6.130.0, or to IOS-XE 3.3, if the APs are supported in the earlier code.

Further Problem Description:
This problem affects only APs that were manufactured with incorrect SHA2
certificates. APs with only SHA1 certificates are not affected. To determine
whether an AP is affected, use the following AP exec commands (while the AP
has a 15.3(3)/8.0 image installed):

1. Check for the presence of a SHA2 Parameter Block:

ap#test pb display

if the output of this command includes:

SHA2 Parameter Block Doesn’t have any Records

then this AP is not affected. If the output of this command shows

Display of the SHA2 Parameter Block

then

2. See whether a correct SHA2 certificate is present:

ap#show crypto pki trustpoints | include SHA2

if there is no valid SHA2 certificate, then this will show no output.
If there is a valid SHA2 cert, this will show:

cn=Cisco Manufacturing CA SHA2

Only APs which *do* have a SHA2 Parameter Block and which *do not* have
a valid SHA2 certificate are affected by this bug.
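The two checks above boil down to a simple rule, which I sketch here in Python. This is only an illustration: the function name is my own, and it assumes you have already captured the output of the two AP commands as text and that the marker strings appear exactly as quoted in the bug description.

```python
def ap_affected_by_cscur43050(test_pb_output, trustpoints_output):
    """Return True if the AP has a SHA2 Parameter Block but no valid SHA2 cert.

    test_pb_output:     text captured from 'test pb display'
    trustpoints_output: text captured from
                        'show crypto pki trustpoints | include SHA2'
    """
    # Step 1: is a SHA2 Parameter Block present at all?
    has_sha2_block = "Display of the SHA2 Parameter Block" in test_pb_output
    # Step 2: is a valid SHA2 certificate installed?
    has_valid_sha2_cert = "cn=Cisco Manufacturing CA SHA2" in trustpoints_output
    # Affected only if the block exists AND the certificate is missing.
    return has_sha2_block and not has_valid_sha2_cert


# An AP with a SHA2 block and empty trustpoint output is affected.
print(ap_affected_by_cscur43050("Display of the SHA2 Parameter Block", ""))
```

An AP that only has SHA1 certificates returns False here, because the first check already fails.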

The problem symptoms will vary according to whether or not the WLC has a
SHA2 certificate installed. To verify this, use the following command on
the AireOS CLI:

(Cisco Controller) >show certificate all
and look for:
Certificate Name: Cisco SHA2 device cert
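To tie the WLC check back to the two symptoms described under Conditions, here is a small Python sketch. It assumes the AP is already known to be affected and that the output of show certificate all has been captured as text; the function name and the returned wording are my own.

```python
def expected_symptom(show_certificate_all_output):
    """Predict which symptom an affected AP shows on an 8.0.100.x WLC."""
    # A WLC with a SHA2 MIC lists this certificate name.
    wlc_has_sha2 = "Cisco SHA2 device cert" in show_certificate_all_output
    if wlc_has_sha2:
        # First symptom: join, download, reload, then fail to rejoin.
        return "AP downloads the image, then fails to rejoin"
    # Second symptom: join works, a later upgrade download loops.
    return "AP joins, but fails the download on a later upgrade"


print(expected_symptom("Certificate Name: Cisco SHA2 device cert"))
```

In my case the AP failed to rejoin after the download, which matches the first symptom.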

I then downgraded my WLC to version 7.6.130.0 and the AP joined.
This is just a short post, but it may help anyone who runs into this kind of problem.

Lightweight Access Point joining issues to WLC Part 1

In this post I will try to cover as many as possible of the problems that prevent an AP from joining a WLC.

First of all we should know that there are two types of Access Points (I am only talking about Cisco products):

  1. Autonomous AP or Standalone AP
  2. Lightweight AP

Autonomous AP: This type of AP doesn’t need a WLC and can be used in small office / home office scenarios. (I will not go into detail here; maybe a later post will cover how it works and how to configure it.)

Lightweight AP: This type of AP can only be used with Wireless LAN Controllers. These can be used in medium to large deployments.

How to verify if it’s an autonomous AP or Lightweight?

Here are the two ways:

  • Connect to the AP using a console cable and log in (if you are asked for credentials, the default username and password are both Cisco, and the default enable password is Cisco). As a side note, the autonomous code shows the prompt ap> by default and only asks for an enable password. The lightweight code asks for a username and password by default, and by default displays the AP MAC address as the prompt. This can be a first indication, but all of it can be changed through configuration, so treat it as a hint rather than proof.
  • On the AP console, type show version. If the AP runs autonomous code, the version string contains k9w7. If the AP runs lightweight code, it contains k9w8.
  • Want to know more about AP versions? See: Understand AP IOS Images
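If you collect show version output from many APs, the k9w7 / k9w8 check is easy to script. Here is a minimal Python sketch; the function name is mine and the sample image string is made up for illustration, not copied from a real AP.

```python
def ap_image_type(show_version_output):
    """Classify an AP image from 'show version' output."""
    text = show_version_output.lower()
    if "k9w8" in text:
        return "lightweight"
    if "k9w7" in text:
        return "autonomous"
    # Neither marker found: do not guess.
    return "unknown"


# Made-up sample line, only to show the call.
print(ap_image_type("System image file is flash:/example-k9w8-image"))  # lightweight
```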

Now we know that only LAPs have to join a WLC; without a WLC this kind of AP will not work.

Before looking for the reason why an AP is not joining, we must first understand what happens behind the scenes.

In order for the WLC to be able to manage the LAP, the LAP should discover the controller and register with the WLC. There are different methods that an LAP uses in order to discover the WLC.

Four main events occur:

  1. Discovery Requests
  2. Discovery Response
  3. Join Request
  4. Join Response

Refer to: LAP Registration to WLC

So now we assume that the AP has an IP address, either statically or via DHCP.

Without an IP address the AP will not do anything, so we first need to give it one; only then can it send a discovery request.

Basic things to check:

  1. Did the AP get an IP address via DHCP?
  2. Can you ping the AP from the WLC, and vice versa?
  3. Is the VLAN in which the AP got its IP blocked by anything on the switch, such as STP?
  4. Check the logs on the AP: it must start sending discovery requests for WLCs.

If everything is fine up to this point, we can move on to some common issues that stop an LAP from joining a WLC.

Scenario 1: Mismatch in Regulatory Domain

I have seen this error many times.

To see it, run debug capwap <events/errors> enable or debug lwapp <events/errors> enable on the WLC.

Sample Error Logs:

802.a or 80211bg Regulatory Domain (-E) does not match with country(AU )
AP RegDomain check for the country AU failed
Regulatory Domain check Completely FAILED The AP will not be allowed to join

These errors clearly show that there is a mismatch in the regulatory domain of the LAP and the WLC. To resolve this issue, add the country for which the AP was built to the list of countries supported on the controller from Wireless > Country. We have to disable all 802.11b/g and 802.11a radios to change the controller country codes list.


In my example I configured only DE; this country uses the -E regulatory domain on the AP.

The WLC can support multiple regulatory domains, but each regulatory domain must be selected before an LAP from that domain can join. When you purchase APs and WLCs, ensure that they share the same regulatory domain. Only then can the LAPs register with the WLC.

You can check which regulatory domain applies to which country for Access Points in the Wireless Compliance Status listing.
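When going through a long debug capture, it helps to pull the AP's regulatory domain and the controller's country out of the mismatch line automatically. Here is a minimal Python sketch; the function name is mine, and the regular expression assumes the message has the same shape as the sample error above.

```python
import re

# Matches e.g. "Regulatory Domain (-E) does not match with country(AU )"
MISMATCH = re.compile(
    r"Regulatory Domain \((-\w+)\) does not match with country\s*\(\s*(\w+)\s*\)"
)


def parse_regdomain_mismatch(log_line):
    """Return (ap_domain, controller_country) or None if the line differs."""
    match = MISMATCH.search(log_line)
    if match is None:
        return None
    return match.group(1), match.group(2)


line = "802.a or 80211bg Regulatory Domain (-E) does not match with country(AU )"
print(parse_regdomain_mismatch(line))  # ('-E', 'AU')
```

With the domain and country in hand, you know which country to add under Wireless > Country.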

Scenario 2: Certificate and Time

The AP and the controller need to exchange certificates to create a secure tunnel for communication. These certificates have a creation date and an expiry date. If the time and date on the WLC are wrong, the AP certificate will be refused because it appears to be not valid yet or no longer valid.

We must run these debug commands to find out the exact error:

debug capwap errors enable and debug pm pki enable

Sample Error logs:

Does not include valid certificate in CERTIFICATE_PAYLOAD from AP MACADDRESS. Unable to free public key.
Current time outside AP cert validity interval: make sure the controller time is set.

To resolve this kind of issue, set the controller time and date to the current value, either from the GUI under Commands > Set Time or with the config time command from the CLI.


We can also get this kind of message if the AP certificate is no longer valid or is corrupted. In that case we must return the AP to our supplier and get a new one.

We can check the AP certificate validity on the AP with this command: show crypto ca certificates
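The "outside AP cert validity interval" error is just a date comparison. Here is a minimal Python sketch of that check; the function name and the example dates are made up for illustration, so take the real validity dates from the AP certificate output.

```python
from datetime import datetime


def cert_time_status(controller_time, not_before, not_after):
    """Compare the controller clock with a certificate validity interval."""
    if controller_time < not_before:
        return "not valid yet"   # controller clock is too far in the past
    if controller_time > not_after:
        return "expired"         # controller clock is past the end date
    return "valid"


# Hypothetical dates: a controller whose clock was reset to year 2000.
not_before = datetime(2014, 9, 1)
not_after = datetime(2024, 9, 1)
print(cert_time_status(datetime(2000, 1, 1), not_before, not_after))  # not valid yet
```

This is why a controller that lost its clock settings can suddenly refuse APs that joined fine before.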

Scenario 3: Firewall Blocking Necessary Ports

When APs and controllers are in different subnets, make sure that routing and firewall filters allow traffic both ways.

Enable these UDP ports for LWAPP traffic:

UDP ports 12222 and 12223 must be open in both directions.

Enable these UDP ports for CAPWAP traffic:

UDP ports 5246 and 5247 must be open in both directions.

If the AP cannot access the controller on UDP port 5246 (CAPWAP Control), the discovery and join requests never reach the controller. The result is that the AP is not seen on the controller, and the debug capwap events enable command on the controller does not display any message about the AP.

If the controller's replies from UDP port 5246 (CAPWAP Control) cannot reach the AP, the discovery and join responses never reach the AP. The result is that the controller receives discovery requests and answers with discovery responses, but the AP does not get these responses and never moves to the join phase.
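When reviewing a firewall rule set, I find it handy to keep the required ports in one place and compare them against what is actually allowed. Here is a minimal Python sketch; the names are mine, and it assumes you already have the list of allowed UDP ports between the AP and WLC subnets. Remember to run it once per direction.

```python
# UDP ports that must be open in both directions, per protocol.
REQUIRED_UDP_PORTS = {
    "LWAPP": {12222, 12223},
    "CAPWAP": {5246, 5247},
}


def missing_udp_ports(protocol, allowed_ports):
    """Return the required UDP ports that are not in allowed_ports, sorted."""
    required = REQUIRED_UDP_PORTS[protocol.upper()]
    return sorted(required - set(allowed_ports))


# Only CAPWAP control is allowed here, so 5247 is reported as missing.
print(missing_udp_ports("capwap", [5246]))  # [5247]
```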

Scenario 4: Brand New Access Points

With new Access Points, and sometimes even with older ones, we can run into compatibility issues with the WLC version.

Example: The 1600 and 3600 APs are new models, and require new controller codes. The 1600 AP requires controller code release 7.4.100.0 or later, and the 3600 AP requires controller code 7.2 or later. The same issue affects 802.11n APs and older controller codes. If the controller code is too old, the AP model is not recognized.

We must run these debug commands to find out the exact error:

debug capwap errors enable

Sample Error Logs

AP Associated. Base Radio MAC: MAC ADDRESS
AP Disassociated. Base Radio MAC: MAC ADDRESS
AP with MAC MAC ADDRESS is unknown.

To resolve this issue, we have to upgrade the controller code or have the AP discover a controller running the appropriate code version.

Check the Cisco Software Compatibility Matrix for the supported combinations.

To find the WLC version in the GUI, go to Monitor and check the Controller Summary.


Via CLI:

(WLAN1) >show sysinfo
Manufacturer's Name.............................. Cisco Systems Inc.
Product Name..................................... Cisco Controller
Product Version.................................. 7.0.240.0
RTOS Version..................................... 7.0.240.0
Bootloader Version............................... 4.0.191.0
Emergency Image Version.......................... N/A
Build Type....................................... DATA + WPS
System Name...................................... WLAN1
System Location.................................. Test Lab
System Contact................................... Sandeep
System ObjectID.................................. 1.3.6.1.4.1.9.1.828
IP Address....................................... 10.99.80.1
System Up Time................................... 3 days 23 hrs 12 mins 31 secs
System Timezone Location......................... (GMT +1:00) Amsterdam, Berlin, Rome, Vienna
Configured Country............................... DE  - Germany
Operating Environment............................ Commercial (0 to 40 C)
Internal Temp Alarm Limits....................... 0 to 65 C
Internal Temperature............................. +42 C
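To compare the Product Version from show sysinfo with the minimum release an AP model needs, a few lines of Python are enough. This is only a sketch: the function names are mine, and the table contains just the two minimum releases mentioned in this post (7.4.100.0 for the 1600, 7.2 for the 3600). Check the compatibility matrix for any other model.

```python
import re

# Minimum controller releases quoted in this post; extend as needed.
MIN_CONTROLLER_VERSION = {
    "1600": (7, 4, 100, 0),
    "3600": (7, 2),
}


def product_version(show_sysinfo_output):
    """Extract 'Product Version' from show sysinfo output as a tuple of ints."""
    match = re.search(r"Product Version\.+\s*(\d+(?:\.\d+)*)", show_sysinfo_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def controller_supports(ap_model, show_sysinfo_output):
    """True if the controller release is at least the model's minimum."""
    version = product_version(show_sysinfo_output)
    if version is None:
        raise ValueError("no Product Version line found")
    # Tuples compare element by element, so (7, 0, 240, 0) < (7, 4, 100, 0).
    return version >= MIN_CONTROLLER_VERSION[ap_model]


sample = "Product Version.................................. 7.0.240.0"
print(product_version(sample))              # (7, 0, 240, 0)
print(controller_supports("1600", sample))  # False
```

So the 7.0.240.0 controller shown above is too old for a 1600 AP and would need an upgrade first.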

 

Part 2 coming soon……. 🙂