[rfc2737]: Handle unicode error when parsing transceiver #235

Merged
merged 5 commits into from
Mar 1, 2022


SuvarnaMeenakshi

data.

Signed-off-by: Suvarna Meenakshi [email protected]

- What I did
If the transceiver data is corrupted and contains junk characters, SNMP fails to parse it and the transceiver MIB cannot be queried.
Error seen in this scenario:

show interface transceiver eeprom
...
Ethernet5: SFP EEPROM detected
        Connector: Unknown
        Encoding: Unknown
        Extended Identifier: Unknown
        Extended RateSelect Compliance: Unknown
        Identifier: Unknown
        Length Cable Assembly(m): 255
        Length OM1(m): 255
        Length OM2(m): 255
        Length OM3(2m): 255
        Length(km): 255
        Nominal Bit Rate(100Mbs): 255
        Specification compliance:
                10/40G Ethernet Compliance Code: 10GBase-LR
                Fibre Channel Speed: 100 Mbytes/Sec
                Fibre Channel link length/Transmitter Technology: AAA
                Fibre Channel transmission media: AAA
                Gigabit Ethernet Compliant codes: 1000BASE-CX
                SAS/SATA compliance codes: AAA
                SONET Compliance codes: AAA
        Vendor Date Code(YYYY-MM-DD Lot): 20ÿÿ-ÿÿ-ÿÿ ÿÿ
        Vendor Name: ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
        Vendor OUI: AAA
        Vendor PN: ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
        Vendor Rev: ÿÿ
        Vendor SN: ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ

Error log in syslog:
INFO snmp#supervisord: snmp-subagent ERROR:ax_interface:MIBUpdater.start() caught an unexpected exception during update_data()
INFO snmp#supervisord: snmp-subagent Traceback (most recent call last):
INFO snmp#supervisord: snmp-subagent   File "/usr/local/lib/python3.6/dist-packages/ax_interface/mib.py", line 40, in start
INFO snmp#supervisord: snmp-subagent     self.reinit_data()
INFO snmp#supervisord: snmp-subagent   File "/usr/local/lib/python3.6/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 192, in reinit_data
INFO snmp#supervisord: snmp-subagent     self._update_transceiver_cache(interface)
INFO snmp#supervisord: snmp-subagent   File "/usr/local/lib/python3.6/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 276, in _update_transceiver_cache
INFO snmp#supervisord: snmp-subagent     self.physical_model_name_map[sub_id] = get_transceiver_data(transceiver_info)
INFO snmp#supervisord: snmp-subagent   File "/usr/local/lib/python3.6/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 76, in <genexpr>
INFO snmp#supervisord: snmp-subagent     for xcvr_field in XcvrInfoDB)

Because of this, the xcvr MIB does not return the expected output:
iso.3.6.1.2.1.47.1.1.1.1.13.5000 = No Such Instance currently exists at this OID
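
The failure mode can be illustrated with a minimal sketch. It assumes the fields arrive as raw bytes and are decoded as UTF-8 inside a generator expression, as the `<genexpr>` frame in the traceback suggests. The dictionary, the field tuple and `get_fields_strict` are hypothetical stand-ins, not the actual rfc2737.py code.

```python
# Hypothetical stand-in for one TRANSCEIVER_INFO entry, with values as raw bytes.
transceiver_info = {
    "type": b"QSFP+",
    "hardware_rev": b"\xff\xff",
    "serialnum": b"\xff" * 16,
}

# Hypothetical field list; the real code iterates over its own field enum.
FIELDS = ("type", "hardware_rev", "serialnum")


def get_fields_strict(info):
    """Decode every field strictly; one bad field aborts the whole tuple."""
    return tuple(info[name].decode("utf-8") for name in FIELDS)


try:
    get_fields_strict(transceiver_info)
    failed = False
except UnicodeDecodeError as exc:
    # 0xff is never a valid UTF-8 start byte, so decoding stops at offset 0.
    failed = True
    failed_at = exc.start
    print("decode failed at byte offset", failed_at)
```

Because the exception escapes the generator expression, the valid "type" field is lost along with the corrupted ones, which is consistent with the whole entry being missing from the MIB.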

This fix avoids the error and lets SNMP return the transceiver information that is available.

- How I did it
Handle UnicodeError to deal with the parsing error seen in sonic_ax_impl.
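
A rough sketch of this approach, assuming the field values are bytes decoded as UTF-8. The helper name `decode_field` and the logging call are illustrative assumptions, not the code merged in this PR.

```python
import logging

logger = logging.getLogger(__name__)


def decode_field(raw, encoding="utf-8"):
    """Hypothetical helper: decode one DB value, falling back to ''.

    UnicodeDecodeError is a subclass of UnicodeError, so catching
    UnicodeError covers the strict-decode failure.
    """
    if raw is None:
        return ""
    if isinstance(raw, str):
        # Already decoded by the DB layer; nothing to do.
        return raw
    try:
        return raw.decode(encoding)
    except UnicodeError:
        logger.warning("Undecodable transceiver field dropped: %r", raw)
        return ""


print(repr(decode_field(b"QSFP+")))       # valid field is preserved
print(repr(decode_field(b"\xff" * 16)))   # corrupted field becomes ''
```

With a per-field fallback like this, one corrupted value no longer prevents the remaining fields of the same transceiver from being served.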

- How to verify it
The fix was applied and tested on the device where the above error was seen.

  • No error message in syslog.
  • Able to retrieve xcvr information using OID: iso.3.6.1.2.1.47.1.1.1.1

- Description for the changelog

@qiluo-msft

qiluo-msft commented Sep 29, 2021

Add a unit test?


In reply to: 929771195

@qiluo-msft

Could you double check the impact of similar issue on later branches including master?

@SuvarnaMeenakshi

Add a unit test?

Added a unit test with mock junk string in DB

Signed-off-by: Suvarna Meenakshi <[email protected]>
@SuvarnaMeenakshi

Could you double check the impact of similar issue on later branches including master?

It has been hard to reproduce the exact scenario of a corrupted string in STATE_DB with SNMP trying to decode it.
In branches 202012 and above, the DB interface APIs have changed, so SNMP no longer decodes the string after fetching it from the DB.
On one device where the above SNMP decode error is seen, STATE_DB contains:
5) "serialnum"
6) "\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff"
On the same device, the new image (202012) does not show the corrupted string in the DB. There could be a couple of reasons for this:

  1. The platform transceiver data is fixed and the platform APIs provide the right data.
  2. When transceiver data is written to the DB, the DB interface APIs store the correct string.

Because of this, SNMP is able to fetch a good string from STATE_DB.
Also, the SNMP agent does not decode strings in the 202012 and master branches, so the fix in this PR does not apply to later branches.

@@ -24,6 +24,13 @@
"manufacturer": "VENDOR_NAME",
"model": "MODEL_NAME"
},
"TRANSCEIVER_INFO|Ethernet4": {
"type": "QSFP+",
"hardware_rev": "\xff\xff",

\xff\xff

Let's add another test case \xdcff\xdcff to trigger another bug we found during debugging.


Raised GitHub issue #239, where the bad string can cause an agentx crash in the 2021/master branch.
This issue will be seen on branches >= 2020x.

@SuvarnaMeenakshi

  • iso.3.6.1.2.1.47.1.1.1.1

Could you double check the impact of similar issue on later branches including master?

This issue will be seen in the 201911 and 201811 branches.
After these branches, AgentX no longer calls decode() on the string, so this specific error will not be seen.
The other issue in the newer branches is logged in #239.

@SuvarnaMeenakshi

Various options to fix this:

  1. Handle UnicodeError and return an empty string when it is raised.
  2. Ignore any errors seen while decoding, e.g. decode(encoding='UTF-8', errors='ignore'). The undecodable bytes are silently dropped, so an all-0xff value results in an empty string without any UnicodeError being logged.
  3. Use a less strict encoding such as latin-1. With latin-1, the result looks like this:
    iso.3.6.1.2.1.47.1.1.1.1.13.253000 = Hex-STRING: 46 FF 42 FF 34 FF 30 FF 42 FF 43 FF 35 FF 20 FF
    Can this be handled by services using this output? The OID definition says that this output should be "SnmpString", but this results in a hex string.
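
The three options can be compared side by side with a short sketch. It takes the sixteen octets from the Hex-STRING above as the raw DB value, which is an assumption made for illustration. The function names are hypothetical.

```python
# Assumed raw value, taken from the Hex-STRING output above.
raw = bytes.fromhex("46FF42FF34FF30FF42FF43FF35FF20FF")


def option_1(data):
    """Strict UTF-8; return '' if anything fails to decode."""
    try:
        return data.decode("utf-8")
    except UnicodeError:
        return ""


def option_2(data):
    """UTF-8 with errors='ignore'; invalid bytes are silently dropped."""
    return data.decode("utf-8", errors="ignore")


def option_3(data):
    """latin-1 maps every byte to a code point, so it never fails."""
    return data.decode("latin-1")


for label, fn in (("1", option_1), ("2", option_2), ("3", option_3)):
    print("option", label, "->", repr(fn(raw)))
```

For this mixed value, option 1 discards the whole field, option 2 keeps only the valid ASCII bytes, and option 3 keeps all sixteen characters, including the non-printable 0xff ones that presumably cause the value to be displayed as a Hex-STRING.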

qiluo-msft previously approved these changes Dec 31, 2021
@SuvarnaMeenakshi SuvarnaMeenakshi dismissed qiluo-msft’s stale review January 13, 2022 23:41

Requires change in unit-test.

@SuvarnaMeenakshi SuvarnaMeenakshi merged commit 214378c into sonic-net:201911 Mar 1, 2022