network:eapstroubleshooting

Differences

network:eapstroubleshooting [2018/01/13 12:12]
tschulz [Jan 12th]
network:eapstroubleshooting [2018/01/15 12:57] (current)
tschulz [Jan 15th]
Line 1: Line 1:
 ====== EAPS Troubleshooting ======
  
 +===== Log =====
 +==== Jan 12th ====
 +  * After restarting both **Server_Room** and **5th_Floor**, I could not get the link between the switches to come up at 10G. Reducing the speed to 1G brings the link up but produces a lot of CRC errors, and the connection drops once every 10-20 secs, even when using 1G SFPs
 +  * Any fiber run that is dropping seems to have some sort of CRC errors
 +  * Run from **Ag_Room** to **OLD_KSU** was giving CRC errors on **OLD_KSU** port 1:53 even after restarting both switches. Reconfigured both ports to 1G and the CRC errors stopped
 +  * After reconfiguring both ports back to 10G, the CRC errors seem to have stopped
 +  * Run from **Server_Room** to **5th_Floor** stayed down after restarting both switches. Link did come up after reconfiguring both ports to 1G, but both 1:31 on the **Server_Room** switch and 1:54 on the **5th_Floor** switch had CRC errors
 +  * As of 8pm on Friday, all other switches are showing 0 CRC errors
 +  * Fibers in **Ag_Room** are transposed from the port descriptions.
 +==== Jan 13th ====
 +  * Overnight on Jan 12th, the EAPS master reported <Info:EAPS.RxPduLinkDown> on the **Office** and **OLD_KSU** switches
 +  * Reduced speed of the **Ag_Room** and **OLD_KSU** switches to 1G to see whether CRC errors are causing <Info:EAPS.RxPduLinkDown> on the EAPS master
 +  * **Office** switch port 1:54 is losing connection every hour or so, even after multiple restarts
 +  * **Elementary_LD** port 1:54 had no issues for over 12 hrs, until about 7am on Jan 13th, then started losing uplink every 1-2 min over a 5-10 min period. Started again at 10:30am
 +  * **Elementary_LD** was restarted and port 1:54 resumed normal operation
 +  * Switched the **Office** to **Elementary_LD** and **5th_Floor** to **Server_Room** links over to 1G SFPs on the front of each switch and reconfigured the EAPS ring to use different ports as of **12:01pm**
 +  * **12:20pm** Single-mode fiber link between **Elementary_LD** and **Business_AD_RM** started dropping a lot. Reducing speed to 1G to see if that stops the dropping
 +  * **12:25pm** After reducing both links to 1G, only one side would get a connection. Reverted to 10G and rebooted **Elementary_LD** and **Business_AD_RM**
 +  * **4:30pm** Pulling SFPs off the SFP+ module seems to be fixing the issues, but still having issues between **OLD_KSU** and **5th_Floor**. Planning on fixing this by moving the link to the front SFP ports
 +  * It's starting to look like we have two bad SFP+ modules, in **5th_Floor** and **Elementary_LD**
 +==== Jan 14th ====
 +  * **10am** Moved the other SFP from the SFP+ module to the front SFP ports on both the **5th_Floor** and **OLD_KSU** switches
 +  * **3pm** Still getting about 1 drop/hr on the **OLD_KSU** to **Ag_Room** link and 4-6 drops/hr on **Elementary_LD** to **Business_AD**. All other links are staying up
 +  * **6pm** It looks like we have bad SFP+ modules in **5th_Floor** and **Elementary_LD**, and a failing module in **OLD_KSU** or **Ag_Room**
 +==== Jan 15th ====
 +  * **7:30am** Moved the other SFP from the SFP+ module to the front SFP ports on both the **Ag_Room** and **OLD_KSU** switches
 +  * **8:30am** Link between **Ag_Room** and **Business_AD** started dropping
 +  * **8:51am** Moved back to 10G SFP+ on **OLD_KSU** for the **Ag_Room** link
 +  * **9:00am** Moved the other SFP from the SFP+ module to the front SFP ports on both **Ag_Room** and **Business_AD**
 +  * **11:00am** It looks like only one link is dropping: 1:54 on **Elementary_LD**
 +  * **11:30am** Replaced the 10G card in **Elementary_LD**
 +  * **1:00pm** Only one drop at old **KSU_ROOM**, at 12:35pm
 ===== Switch Addresses =====
 ^Address^Name^
Line 48: Line 80:
 show ports 1:53,1:54 rxerrors
 show ports 1:31,2:31 rxerrors
 +show switch
 +show version
 </file>
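The log entries describe forcing uplink ports down to 1G, putting them back to 10G, and watching for CRC errors. A rough sketch of how that sequence might look, assuming the switches run ExtremeXOS and using port 1:53 purely as an illustration. The exact speed and duplex options depend on the switch model and software version, and these are not necessarily the commands that were actually run:

<file>
# Assumption: ExtremeXOS CLI; port 1:53 is only an example uplink port
# Force the port to 1G with auto-negotiation off
configure ports 1:53 auto off speed 1000 duplex full
# Zero the port counters so any new CRC errors stand out
clear counters ports
# Watch receive errors on the uplink ports
show ports 1:53,1:54 rxerrors
# Put the port back to 10G once the test is done
configure ports 1:53 auto off speed 10000 duplex full
# Check the ring state from the EAPS master
show eaps
</file>

Clearing the counters before each speed change makes it easier to tell whether the errors belong to the current setting or were left over from the previous one.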
  
Line 80: Line 114:
 rtt min/avg/max/mdev = 0.207/1.040/1016.357/17.592 ms, pipe 4
 </file>
- 
-===== Log ===== 
-==== Jan 12th ==== 
-  * After restarting both **Server_Room** and **5th_Floor**, I could not get the link between the switches to come up at 10G. Reducing the speed to 1G brings the link up but produces a lot of CRC errors, and the connection drops once every 10-20 secs, even when using 1G SFPs
-  * Any fiber run that is dropping seems to have some sort of CRC errors
-  * Run from **Ag_Room** to **OLD_KSU** was giving CRC errors on **OLD_KSU** port 1:53 even after restarting both switches. Reconfigured both ports to 1G and the CRC errors stopped
-  * After reconfiguring both ports back to 10G, the CRC errors seem to have stopped
-  * Run from **Server_Room** to **5th_Floor** stayed down after restarting both switches. Link did come up after reconfiguring both ports to 1G, but both 1:31 on the **Server_Room** switch and 1:54 on the **5th_Floor** switch had CRC errors
-  * As of 8pm on Friday, all other switches are showing 0 CRC errors
-  * Fibers in **Ag_Room** are transposed from the port descriptions.
-==== Jan 13th ==== 
-  * Overnight on Jan 12th, the EAPS master reported <Info:EAPS.RxPduLinkDown> on the **Office** and **OLD_KSU** switches
-  * Reduced speed of the **Ag_Room** and **OLD_KSU** switches to 1G to see whether CRC errors are causing <Info:EAPS.RxPduLinkDown> on the EAPS master
-  * **Office** switch port 1:54 is losing connection every hour or so, even after multiple restarts
-  * **Elementary_LD** port 1:54 had no issues for over 12 hrs, until about 7am on Jan 13th, then started losing uplink every 1-2 min over a 5-10 min period. Started again at 10:30am
-  * **Elementary_LD** was restarted and port 1:54 resumed normal operation
-  * Switched the **Office** to **Elementary_LD** and **5th_Floor** to **Server_Room** links over to 1G SFPs on the front of each switch and reconfigured the EAPS ring to use different ports as of 12:01pm
-  *  
  
 ==== Example Rx error report ====
network/eapstroubleshooting.1515867121.txt.gz · Last modified: 2018/01/13 12:12 by tschulz