network:eapstroubleshooting

Differences

This shows you the differences between two versions of the page.

network:eapstroubleshooting [2018/01/13 10:57]
tschulz [CRC Errors]
network:eapstroubleshooting [2018/01/15 10:56]
tschulz [Jan 15th]
Line 1: Line 1:
 ====== EAPS Troubleshooting ======
  
 +===== Log =====
 +==== Jan 12th ====
 +  * After restarting both **Server_Room** and **5th_Floor**, I could not get the link between the switches to come up at 10G. Reducing the speed to 1G brings the link up but produces a lot of CRC errors, and the connection drops once every 10-20 seconds, even when using 1G SFPs
 +  * Any fiber run that is dropping seems to have some sort of CRC errors
 +  * The run from **Ag_Room** to **OLD_KSU** gave CRC errors on **OLD_KSU** port 1:53 even after restarting both switches. Reconfigured both ports to 1G and the CRC errors stopped
 +  * After reconfiguring both ports back to 10G, the CRC errors seemed to have stopped
 +  * The run from **Server_Room** to **5th_Floor** stayed down after restarting both switches. The link did come up after reconfiguring both ports to 1G, but both 1:31 on the **Server_Room** switch and 1:54 on the **5th_Floor** switch had CRC errors
 +  * As of 8pm on Friday, all other switches were showing 0 CRC errors
 +  * Fibers in **Ag_Room** are transposed from the port descriptions.
 +==== Jan 13th ====
 +  * Overnight on Jan 12th the EAPS master reported <Info:EAPS.RxPduLinkDown> on the **Office** and **OLD_KSU** switches
 +  * Reduced the speed of the **Ag_Room** and **OLD_KSU** switches to 1G to see whether CRC errors are causing <Info:EAPS.RxPduLinkDown> on the EAPS master
 +  * Port 1:54 on the **Office** switch loses its connection every hour or so, even after multiple restarts
 +  * **Elementary_LD** port 1:54 had no issues for over 12 hours, until about 7am on Jan 13th, then started losing its uplink every 1-2 min over a 5-10 min period. It started again at 10:30am
 +  * **Elementary_LD** was restarted and port 1:54 resumed normal operation
 +  * Switched the **Office** to **Elementary_LD** and **5th_Floor** to **Server_Room** links over to 1G SFPs on the front of each switch and reconfigured the EAPS ring to use different ports as of **12:01pm**
 +  * **12:20pm** The single-mode fiber link between **Elementary_LD** and **Business_AD_RM** started dropping a lot. Reducing the speed to 1G to see if that stops the drops
 +  * **12:25pm** After reducing both links to 1G, only one side would get a connection. Reverted to 10G and rebooted **Elementary_LD** and **Business_AD_RM**
 +  * **4:30pm** Pulling the SFPs off the SFP+ module seems to be fixing the issues, but the link between **OLD_KSU** and **5th_Floor** is still having problems. Planning to fix it by moving the link to the front SFP ports
 +  * It's starting to look like we have two bad SFP+ modules, in **5th_Floor** and **Elementary_LD**
 +==== Jan 14th ====
 +  * **10am** Moved the other SFP from the SFP+ module to the front SFP ports on both the **5th_Floor** and **OLD_KSU** switches
 +  * **3pm** Still getting about 1 drop/hr on the **OLD_KSU** to **Ag_Room** link and 4-6 drops/hr on the **Elementary_LD** to **Business_AD** link. All other links are staying up
 +  * **6pm** It looks like we have bad SFP+ modules in **5th_Floor** and **Elementary_LD**, and a failing module in **OLD_KSU** or **Ag_Room**
 +==== Jan 15th ====
 +  * **7:30am** Moved the other SFP from the SFP+ module to the front SFP ports on both the **Ag_Room** and **OLD_KSU** switches
 +  * **8:30am** The link between **Ag_Room** and **Business_AD** started dropping
 +  * **8:51am** Moved back to the 10G SFP+ on **OLD_KSU** for the **Ag_Room** link
 +  * **9:00am** Moved the other SFP from the SFP+ module to the front SFP ports on both **Ag_Room** and **Business_AD**
 +  * **11:00am** It looks like only one link is dropping: 1:54 on **Elementary_LD**
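The drop rates in the log (for example "about 1 drop/hr" and "4-6 drops/hr") can be tallied from link-down timestamps rather than estimated by eye. The following is a minimal sketch of that tally, assuming the timestamps have already been pulled out of the switch logs by hand or by some other tool. The link names `link-A` and `link-B`, the sample timestamps, and the threshold of 4 are all hypothetical and do not come from the switches.

```python
from collections import Counter
from datetime import datetime


def drops_per_hour(events):
    """Count link-down events per (link, hour) bucket.

    `events` is an iterable of (link_name, datetime) pairs.
    """
    counts = Counter()
    for link, stamp in events:
        # Truncate the timestamp to the start of its hour.
        hour = stamp.replace(minute=0, second=0, microsecond=0)
        counts[(link, hour)] += 1
    return counts


def flag_links(counts, threshold):
    """Return the links with at least `threshold` drops in any single hour."""
    return sorted({link for (link, _hour), n in counts.items() if n >= threshold})


# Hypothetical sample data, NOT taken from the real switch logs.
events = [
    ("link-A", datetime(2018, 1, 15, 11, 5)),
    ("link-A", datetime(2018, 1, 15, 11, 20)),
    ("link-A", datetime(2018, 1, 15, 11, 50)),
    ("link-A", datetime(2018, 1, 15, 11, 58)),
    ("link-B", datetime(2018, 1, 15, 11, 30)),
    ("link-B", datetime(2018, 1, 15, 12, 10)),
]

counts = drops_per_hour(events)
for (link, hour), n in sorted(counts.items()):
    print(f"{link} {hour:%Y-%m-%d %H:00} drops={n}")

# Links that reached the (assumed) threshold of 4 drops in one hour.
print(flag_links(counts, threshold=4))
```

Bucketing by clock hour keeps the tally simple; a sliding window would catch bursts that straddle an hour boundary, at the cost of a little more code.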
 ===== Switch Addresses =====
 ^Address^Name^
Line 48: Line 78:
 show ports 1:53,1:54 rxerrors
 show ports 1:31,2:31 rxerrors
 +show switch
 +show version
 </file>
  
Line 80: Line 112:
 rtt min/avg/max/mdev = 0.207/1.040/1016.357/17.592 ms, pipe 4
 </file>
- 
-===== Log ===== 
-==== Jan 12th ==== 
-  * Any fiber run that is dropping seems to have some sort of CRC errors 
-  * The run from **Ag_Room** to **OLD_KSU** gave CRC errors on **OLD_KSU** port 1:53 even after restarting both switches. Reconfigured both ports to 1G and the CRC errors stopped
-  * After reconfiguring both ports back to 10G, the CRC errors seemed to have stopped
-  * The run from **Server_Room** to **5th_Floor** stayed down after restarting both switches. The link did come up after reconfiguring both ports to 1G, but both 1:31 on the **Server_Room** switch and 1:54 on the **5th_Floor** switch had CRC errors
-  * As of 8pm on Friday, all other switches were showing 0 CRC errors
-  * Fibers in **Ag_Room** are transposed from the port descriptions. 
-==== Jan 13th ==== 
-  * Overnight on Jan 12th the EAPS master reported <Info:EAPS.RxPduLinkDown> on the **Office** and **OLD_KSU** switches
-  * Reduced the speed of the **Ag_Room** and **OLD_KSU** switches to 1G to see whether CRC errors are causing <Info:EAPS.RxPduLinkDown> on the EAPS master
-  * Port 1:54 on the **Office** switch loses its connection every hour or so, even after multiple restarts
-  * **Elementary_LD** port 1:54 had no issues for over 12 hours, until about 7am on Jan 13th, then started losing its uplink every 1-2 min over a 5-10 min period. It started again at 10:30am
-  * **Elementary_LD** was restarted and port 1:54 resumed normal operation 
  
 ==== Example Rx error report ====
network/eapstroubleshooting.txt · Last modified: 2018/01/15 12:57 by tschulz