Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add way to filter network interfaces #277

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

erhudy
Copy link

@erhudy erhudy commented Mar 5, 2024

Description

Under certain circumstances, net.Interfaces() will begin returning interfaces in unusual orders (e.g. if the interface index goes above 255 or 256). This could cause the driver to select the wrong interface, which in turn would mean that it could not find a matching VM in vSphere and would crash. This changeset adds arguments to control what interfaces can be selected by the driver.

GitHub Issues

List the GitHub issues impacted by this PR:

GitHub Issue #

Checklist:

  • Have you run format,vet & lint checks against your submission?
  • Have you made sure that the code compiles?
  • Did you run the unit & integration tests successfully?
  • Have you maintained at least 90% code coverage?
  • Have you commented your code, particularly in hard-to-understand areas
  • Have you done corresponding changes to the documentation
  • Did you run tests in a real Kubernetes cluster?
  • Backward compatibility is not broken

How Has This Been Tested?

Built an image and deployed it in the affected Kubernetes cluster, along with changes to the Helm templates and values to surface the new environment variables so that they can be configured as needed.

@boyamurthy
Copy link
Contributor

hi @erhudy , thank you for submitting the PR . We have already passed the code freeze date for CSM 1.10 release and we will consider this for CSM 1.11 .

main.go Outdated
@@ -106,4 +106,10 @@ const usage = ` X_CSI_POWERMAX_ENDPOINT
X_CSI_POWERMAX_DEBUG
Turns on debugging of the PowerMax (REST interface to Unisphere) layer
X_CSI_IFACE_INCLUDE_FILTER
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename arguments to reflect these are VM interfaces.

Copy link
Contributor

@nidtara nidtara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How has this been tested, please attach test logs to the PR

if v.HardwareAddr.String() != "" {
return v.HardwareAddr.String(), nil
if ifaceExcludeFilter != nil || ifaceIncludeFilter != nil {
if ifaceIncludeFilter != nil && ifaceIncludeFilter.MatchString(v.Name) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

elaborate on the use-case for having the ifaceIncludeFilter?
anyhow all VMs except excludefilter will be selected

@erhudy
Copy link
Author

erhudy commented Apr 30, 2024

Sorry, this fell off the radar. The general purpose of includeFilter is to get it to select a particular interface, e.g. ens192 is the interface on our VMware VMs. That being said, I think you're right that it doesn't work exactly as intended in its present form, but it could probably be simplified to allow just specifying a particular interface name and that would avoid all the dynamic interface selection stuff entirely.

As far as testing this goes, I don't think I could run the integration test suite at that time because of some reason unrelated to the PR (it was failing on the main branch as well so probably some environment problem). The PR itself has been in use at my company since I made it, in order to filter out Calico's interfaces from selection.

@erhudy
Copy link
Author

erhudy commented May 5, 2024

How has this been tested, please attach test logs to the PR

I simplified this PR by dropping IfaceIncludeFilter, so now it only excludes interfaces. Let me know what other changes you would like to see.

@boyamurthy
Copy link
Contributor

How has this been tested, please attach test logs to the PR

I simplified this PR by dropping IfaceIncludeFilter, so now it only excludes interfaces. Let me know what other changes you would like to see.

hi @erhudy, could you rebase the branch with latest main.

@erhudy erhudy force-pushed the add-interface-filters branch 2 times, most recently from a6d8bd0 to 1421bdf Compare May 7, 2024 12:54
@erhudy
Copy link
Author

erhudy commented May 7, 2024

How has this been tested, please attach test logs to the PR

I simplified this PR by dropping IfaceIncludeFilter, so now it only excludes interfaces. Let me know what other changes you would like to see.

hi @erhudy, could you rebase the branch with latest main.

Rebased off main and fixed the linting error. I built a new container image from this branch after all my changes and am using it now in an internal test cluster, with the config option X_CSI_IFACE_EXCLUDE_FILTER="^cali.+" to exclude Calico interfaces.

@boyamurthy
Copy link
Contributor

How has this been tested, please attach test logs to the PR

I simplified this PR by dropping IfaceIncludeFilter, so now it only excludes interfaces. Let me know what other changes you would like to see.

hi @erhudy, could you rebase the branch with latest main.

Rebased off main and fixed the linting error. I built a new container image from this branch after all my changes and am using it now in an internal test cluster, with the config option X_CSI_IFACE_EXCLUDE_FILTER="^cali.+" to exclude Calico interfaces.

Thank you @erhudy .

@shanmydell
Copy link
Collaborator

@nidtara @boyamurthy : Please review the latest changes

nidtara
nidtara previously approved these changes Jun 11, 2024
Copy link
Contributor

@nidtara nidtara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

boyamurthy
boyamurthy previously approved these changes Jun 19, 2024
nidtara
nidtara previously approved these changes Jun 20, 2024
Copy link
Contributor

@nidtara nidtara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@delldubey
Copy link
Contributor

@erhudy , can you rebase this with the main?
Thanks.

@erhudy erhudy dismissed stale reviews from nidtara and boyamurthy via c0e2f6e June 25, 2024 23:44
@erhudy
Copy link
Author

erhudy commented Jun 25, 2024

@delldubey branch updated.

Unrelated to this branch, but if I have a crash report with an NPE, what's the best way to report that? The issues tracker on this repository is not open to me.

@delldubey
Copy link
Contributor

@erhudy , here csm/issues

Under certain circumstances, net.Interfaces() will begin
returning interfaces in unusual orders (e.g. if the interface
index goes above 255 or 256). This could cause the driver to
select the wrong interface, which in turn would mean that
it could not find a matching VM in vSphere and would crash.
This changeset adds an argument to control what interfaces can
be selected by the driver, by allowing certain interfaces to be
excluded from consideration.
Copy link
Collaborator

@shanmydell shanmydell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been tested earlier and approved. LGTM

@erhudy
Copy link
Author

erhudy commented Oct 14, 2024

What is required to complete this?

If somebody is waiting for me to provide unit test output, I cannot run the unit tests locally for reasons that do not appear to have anything to do with my changes:

( cd service; go clean -cache; CGO_ENABLED=0 GO111MODULE=on go test -v -coverprofile=c.out ./... )
=== RUN   TestApiRouter2
time="2024-10-14T23:18:31Z" level=info msg="starting http server on port :abc"
time="2024-10-14T23:18:31Z" level=error msg="unable to start http server to serve status requests due to listen tcp: lookup tcp/abc: unknown port"
time="2024-10-14T23:18:31Z" level=info msg="started http server to serve status requests at :abc"
--- PASS: TestApiRouter2 (0.00s)
=== RUN   TestApiRouter
time="2024-10-14T23:18:31Z" level=info msg="starting http server on port :8083"
time="2024-10-14T23:18:33Z" level=info msg="connectivityStatus called, status is <nil> \n"
time="2024-10-14T23:18:33Z" level=error msg="error probeStatus map in cache is empty"
time="2024-10-14T23:18:33Z" level=info msg="connectivityStatus called, status is &{{0 0} {[] {} 0x400061fb00} map[SymID2:0x40000801f0] 0} \n"
time="2024-10-14T23:18:33Z" level=error msg="invalid data is stored in cache"
time="2024-10-14T23:18:33Z" level=error msg="error invalid data is stored in cache during marshaling to json"
time="2024-10-14T23:18:33Z" level=info msg="connectivityStatus called, status is &{{0 0} {[] {} 0x40003d8330} map[SymID:0x40003de078] 0} \n"
time="2024-10-14T23:18:33Z" level=info msg="sending connectivityStatus for all arrays "
time="2024-10-14T23:18:33Z" level=info msg="GetArrayConnectivityStatus called for array SymIDNotPresent \n"
time="2024-10-14T23:18:33Z" level=info msg="GetArrayConnectivityStatus called for array SymID3 \n"
time="2024-10-14T23:18:33Z" level=error msg="error json: unsupported type: chan int during marshaling to json"
time="2024-10-14T23:18:33Z" level=info msg="GetArrayConnectivityStatus called for array SymID \n"
time="2024-10-14T23:18:33Z" level=info msg="sending response {LastSuccess:1728947913 LastAttempt:1728947913} for array SymID \n"
--- PASS: TestApiRouter (2.00s)
=== RUN   TestMarshalSyncMapToJSON
=== RUN   TestMarshalSyncMapToJSON/storing_valid_value_in_map_cache
=== RUN   TestMarshalSyncMapToJSON/storing_valid_value_in_map_cache#01
time="2024-10-14T23:18:33Z" level=error msg="invalid data is stored in cache"
--- PASS: TestMarshalSyncMapToJSON (0.00s)
    --- PASS: TestMarshalSyncMapToJSON/storing_valid_value_in_map_cache (0.00s)
    --- PASS: TestMarshalSyncMapToJSON/storing_valid_value_in_map_cache#01 (0.00s)
=== RUN   TestStartAPIService
time="2024-10-14T23:18:33Z" level=info msg="startNodeToArrayConnectivityCheck is running probes at pollingFrequency 30 "
time="2024-10-14T23:18:33Z" level=info msg="starting http server on port :8083"
time="2024-10-14T23:18:33Z" level=error msg="unable to start http server to serve status requests due to listen tcp :8083: bind: address already in use"
time="2024-10-14T23:18:33Z" level=info msg="started http server to serve status requests at :8083"
--- PASS: TestStartAPIService (0.00s)
=== RUN   TestStartAPIServiceNoPodmon
time="2024-10-14T23:18:33Z" level=info msg="podmon is not enabled"
--- PASS: TestStartAPIServiceNoPodmon (0.00s)
=== RUN   TestGoDog
starting godog...
Feature: PowerMax CSI interface
  As a consumer of the CSI interface
  I want to test controller publish / unpublish interfaces
  So that they are known to work
startup time: 2.097824036s
goroutines start aPowerMaxService new 16 old groutines 0
&{CSI-Test-Node-1 0 1 0 false false   iSCSI [iqn.1993-08.org.centos:01:5ae577b352a0] [] [] 0 0}
&{CSI-Test-Node-2 0 2 0 false false   iSCSI [iqn.1993-08.org.centos:01:5ae577b352a1 iqn.1993-08.org.centos:01:5ae577b352a2] [] [] 0 0}
&{CSI-Test-Node-3-FC 0 2 0 false false   Fibre [20000090fa9278dd 20000090fa9278dc] [] [] 0 0}
*****added** &{DEL-snapshot-1 0 false false 719899255 Established}*******Total Snaps on 00001 are: 1*********added** &{DEL-snapshot-2 0 false false 719916755 Established}*******Total Snaps on 00002 are: 1****time="2024-10-14T23:18:33Z" level=info msg="server url: http://127.0.0.1:40423"
time="2024-10-14T23:18:33Z" level=info msg="Couldn't read ../../gopowermax/mock/symmetrix46.json"
time="2024-10-14T23:18:33Z" level=error msg="Failed to fetch details for array: 000197900046. [Not Found]"
time="2024-10-14T23:18:33Z" level=info msg="Couldn't read ../../gopowermax/mock/symmetrix47.json"
time="2024-10-14T23:18:33Z" level=error msg="Failed to fetch details for array: 000197900047. [Not Found]"
time="2024-10-14T23:18:33Z" level=info msg="Couldn't read ../../gopowermax/mock/symmetrix13.json"
time="2024-10-14T23:18:33Z" level=error msg="Failed to fetch details for array: 000000000013. [Not Found]"
time="2024-10-14T23:18:33Z" level=error msg="None of the managed arrays specified are locally connected"
FAIL	github.com/dell/csi-powermax/v2/service	2.149s
FAIL
make: *** [Makefile:64: unit-test] Error 1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants