Warning and Errors when performing Proxmox provisioning #177

Open
markjhunsinger opened this issue Jan 12, 2024 · 26 comments

@markjhunsinger

Hello!

I've been following the blog post on installing GOAD on Proxmox and have run into a few hiccups along the way, but can't quite figure out this last one. I'm on Part 4 of the walkthrough - Run the playbook.

I'm seeing a warning about an error when collecting bios/platform/processor facts: "Failed to get SMBIOS buffer information (Incorrect function...)". I've included a screenshot below.

gathering-facts

The script continues, and then I'm seeing a bunch of fatal errors seemingly related to NuGet. Another screenshot below.

task-common

play-recap

I've already tried removing the VMs in Proxmox and rebuilding them using Terraform, but I get the same result. I've also tried different versions of Ansible with no luck.

Any suggestions are appreciated!

@Mayfly277
Collaborator

This error occurs because the VMs don't have internet access. Verify that pfSense can resolve DNS, and verify that pfSense allows the GOAD VLAN to reach the internet.

@markjhunsinger
Author

It seems like pfsense can resolve DNS and the firewall rules for VLAN10 are set correctly.

DNS Lookup:
dns-lookup

Firewall rules for VLAN10:
fw-vlan10

But you are right: the VMs do not have internet access, and I'm not quite sure why. The WAN and LAN devices (Proxmox, pfSense, and provisioning) all have internet.

Any insight into my firewall rule or anything else I can check?

@markjhunsinger
Author

After a few Proxmox reboots (or any number of changes I've made in pfsense in the last couple of days, who can say), the VMs now have internet access.

I still got the Gathering Facts warnings, but the script seems to be progressing fine so far despite them.

I will post an update shortly!

@GabrielKrueger

> After a few Proxmox reboots (or any number of changes I've made in pfsense in the last couple of days, who can say), the VMs now have internet access.
>
> I still got the Gathering Facts warnings, but the script seems to be progressing fine so far despite them.
>
> I will post an update shortly!

Any update on this? Facing the exact same issue.

@markjhunsinger
Author

> > After a few Proxmox reboots (or any number of changes I've made in pfsense in the last couple of days, who can say), the VMs now have internet access.
> > I still got the Gathering Facts warnings, but the script seems to be progressing fine so far despite them.
> > I will post an update shortly!
>
> Any update on this? Facing the exact same issue.

Still having some issues with VMs connecting to the internet. Destroyed and rebuilt some of the VMs with terraform, and they will connect to the internet for a short while, but then they'll stop working again out of nowhere.

I got to a point where all VMs were connected to the internet, but as soon as I ran the Ansible provisioning script, the connection died again. I'm going to be messing with it more today to see if I can figure out what's going on.

@markjhunsinger
Author

I switched all the GOAD VMs back to VirtIO network devices, and they are all connected now. I am rerunning the provisioning script at the moment and will provide an update with the results.

@markjhunsinger
Author

Provisioning does not work for me when the interfaces are VirtIO, so I switched them back to Intel.

Still having issues with DC02 getting an internet connection. I believe it has to do with DHCP, since the other four servers have DHCP leases in pfSense, but DC02 does not for some reason.

@GabrielKrueger

The problem seems to be that the machines inside VLAN10 (192.168.10.X) have their gateway set to 192.168.10.1, but it cannot be reached (no ping possible)?

@markjhunsinger
Author

After all that, I still wasn't able to get one of the VMs to connect to the internet, so I ended up starting fresh and installed Proxmox 7.4 in the hope of avoiding the Terraform provider issues (so far so good).

My issue once again is no internet access on any of the VMs. pfSense can resolve DNS and the firewall rules are in place to allow internet to the VLANs, so I'm not quite sure what the issue is. Last time it seemingly ended up correcting itself, but no luck so far.

Going to continue poking at things and see what sticks.

@markjhunsinger
Author

markjhunsinger commented Feb 5, 2024

I've been messing with this and believe I have discovered the issue with internet access on the VMs.

According to the guide, this is what the VLAN10 firewall should look like:

image

The INTERNAL alias we set up includes the following Networks:

192.168.1.1/16 (LAN + VLAN)
10.0.0.1/30 (WAN)
10.10.10.0/24 (VPN)

Although the rule does technically allow communication with the internet, there is no explicit rule allowing communication to the gateway. Please correct me if I'm wrong, but it looks like the VLAN10 firewall needs some additional rules.

If I set it up like this, the hosts can connect to the internet.

image
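
The gap described above can be checked with a few lines of Python. This is only a sketch under an assumption: the guide's VLAN10 rule is taken to be "pass when the destination is NOT in INTERNAL" (the actual rule is only visible in the screenshot), and the helper names `in_internal` and `passes_not_internal_rule` are made up for illustration. The alias entries are copied from the list in this comment.

```python
import ipaddress

# Networks from the INTERNAL alias, as written in the comment.
# strict=False normalises host-style entries such as 192.168.1.1/16
# to their network address (192.168.0.0/16).
INTERNAL = [
    ipaddress.ip_network("192.168.1.1/16", strict=False),  # LAN + VLAN
    ipaddress.ip_network("10.0.0.1/30", strict=False),     # WAN
    ipaddress.ip_network("10.10.10.0/24", strict=False),   # VPN
]


def in_internal(address: str) -> bool:
    """Return True if the address falls inside any INTERNAL network."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in INTERNAL)


def passes_not_internal_rule(destination: str) -> bool:
    # ASSUMPTION: models a single rule "pass, destination = NOT INTERNAL".
    return not in_internal(destination)


for dst in ("8.8.8.8", "192.168.10.1"):
    print(dst, passes_not_internal_rule(dst))
# 8.8.8.8 True
# 192.168.10.1 False
```

If that assumption holds, packets addressed to 192.168.10.1 itself (a ping, or a DNS query to pfSense) would not match the rule, while packets to public addresses would. That would be consistent with the VMs being unable to reach the gateway until an extra rule is added.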

@navees1

navees1 commented Feb 19, 2024

I'm having the same problem, I've tried everything, but I can't solve it.

@chuckjorrit

I think I've found the problem; at least, I have it working on my Proxmox. What I found was that in the Packer config the network card is a "virtio" NIC, so this ends up in the template. When you deploy the default Terraform recipe, the NIC is an "e1000" card. So what happens (at least on my machine) is that the NIC gets renamed to "Ethernet 2". You might think, whatever... but the Ansible playbook is looking to configure the NIC "Ethernet", and as mentioned above that one is not connected, so this causes Ansible to fail.

What I recommend is to log in to one of your systems and check what the name of the NIC is (username: vagrant, PW: vagrant). BTW, don't forget to set the keyboard layout to English, because the default is French. To verify that you have this issue, connect the provisioning CT to the 192.168.10.0/24 network and ping the default gateway.

The way I solved this was to change the NICs in the Terraform recipe to "virtio" and redeploy the VMs. This is in my opinion the quickest solution.

@markjhunsinger
Author

Setting them to virtio certainly gives you Internet, but according to the blog post, you will have issues with the machines connecting to the domain when running the playbook. I had this issue myself so I ended up using the e1000 card as suggested. If it works for you, that's great!

My main issue with the internet was that the VLAN VMs were unable to communicate with the gateway (logging into the Vagrant account confirmed this was the case). I still can't quite put my finger on what fixed it for me, but messing with the VLAN firewall rules seemed to do the trick.

I was able to finish everything else up successfully and have a working GOAD now.

@navees1

navees1 commented Feb 19, 2024

Can you send me a printout with the pfsense rules for your vlan?

@navees1

navees1 commented Feb 19, 2024

image

I created these rules, and I managed to solve the problem

@chuckjorrit

chuckjorrit commented Feb 19, 2024

I suggest setting the protocol to any; that should solve it, I think. Or make a rule for each of the protocols needed. But I suggest the first option.

@markjhunsinger
Author

Here are my current VLAN rules:

image

But as @chuckjorrit mentioned above, you should add a rule at the top to allow all traffic until your VMs can connect to the gateway/Internet. Then you should be able to run the provisioning script without any issue (hopefully). After everything was connecting for me, I removed the "allow any" rule, and it still seems to work fine for me. Don't ask me why.

@aancw

aancw commented Mar 20, 2024

Hi @markjhunsinger, I tried your method of adding an any-to-any rule in VLAN10, and now I can ping 8.8.8.8 (so the VM has internet). But I still can't ping google.com, and I get an error at the task "Upgrade module PowerShellGet to fix accept license issue on last windows ansible version". Do you have any idea why it can't resolve domain names?

Update:
Changing dns_server=192.168.10.1 to dns_server=8.8.8.8 in ../ad/GOAD/providers/proxmox/inventory solved the problem. Is this a safe way to do it?

@bdesforges

bdesforges commented Apr 4, 2024

I also get similar behaviour when the VMs are provisioned.
In my goad.tf file, the VMs are configured with dns = "192.168.10.1".
In ../ad/GOAD/providers/proxmox/inventory I also have dns_server=192.168.10.1.

But when I look at the network configuration, which, as pointed out by @chuckjorrit , shows an interface name of "Ethernet 2", there is no DNS server present.
image

Once I manually add the dns server to 192.168.10.1, I have Internet on the host.

EDIT: this might be my misunderstanding, as it might be Ansible's job to enforce the DNS in part 4, as opposed to it being configured in part 3 as part of the provisioning.

@BerSecHub

> Hi @markjhunsinger I tried your method to add any to any rule in VLAN10 and now i can ping 8.8.8.8(has internet). But still can't ping to google.com and getting error when Upgrade module PowerShellGet to fix accept license issue on last windows ansible version. Do you have any idea why this can't resolv domain?
>
> Update: Change dns_server=192.168.10.1 to dns_server=8.8.8.8 in ../ad/GOAD/providers/proxmox/inventory solve the problem. Is it safe way to do it?

Did you change this DNS after?

@aancw

aancw commented Apr 20, 2024

> > Hi @markjhunsinger I tried your method to add any to any rule in VLAN10 and now i can ping 8.8.8.8(has internet). But still can't ping to google.com and getting error when Upgrade module PowerShellGet to fix accept license issue on last windows ansible version. Do you have any idea why this can't resolv domain?
> > Update: Change dns_server=192.168.10.1 to dns_server=8.8.8.8 in ../ad/GOAD/providers/proxmox/inventory solve the problem. Is it safe way to do it?
>
> Did you change this DNS after?

Yes, I changed the dns_server to 8.8.8.8.

@e-fin

e-fin commented Apr 22, 2024

I had the same problem, and it was also related to DNS resolution. There is a step missing from the setup guide when configuring pfSense.

When doing an nslookup from any host against pfSense I was getting "Query Refused", so I configured an access list on the DNS Resolver in pfSense, and that fixed it.

The picture below shows how I configured it; I just made an allow rule for the 192.168.0.0/16 subnet.

Screenshot 2024-04-22 at 6 19 13 PM

hope this helps others, I was having the same issue as OP.
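
For anyone who wants to check for the "Query Refused" symptom without nslookup, here is a minimal sketch that sends a plain DNS A query over UDP and decodes the response code from the reply header (REFUSED is RCODE 5 in the DNS wire format). The function names are illustrative only, and 192.168.10.1 as the resolver address is an assumption taken from this thread's VLAN10 gateway.

```python
import socket
import struct

# Standard DNS response codes (low 4 bits of the header flags field).
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def rcode_name(response: bytes) -> str:
    """Extract the response code from a raw DNS reply."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


def query_status(server: str, name: str = "google.com", timeout: float = 3.0) -> str:
    """Send one query to `server` on UDP port 53 and return the RCODE name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(512)
    return rcode_name(response)


# Example call from a machine on the lab network (address is an assumption):
# print(query_status("192.168.10.1"))
```

A result of "REFUSED" would match the symptom described above; a timeout (`socket.timeout`) would instead suggest the query is not reaching the resolver at all, which points back at the firewall rules rather than the resolver's access list.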

@aancw

aancw commented Apr 23, 2024

> I had the same problem and it was also related to DNS resolution, there is a step missed in the setup guide when configuring PFSense
>
> When doing an nslookup from any host to pfsense I was getting "Query Refused" so I configured an access list on the DNS Resolver in pfsense and it fixed it.
>
> Picture below is how I configured it, just made an allow rule for the 192.168.0.0/16 subnet
>
> Screenshot 2024-04-22 at 6 19 13 PM
>
> hope this helps others, I was having the same issue as OP.

I will try changing the dns_server for the DC-X and SRV machines back to 192.168.10.1 and try your solution. I will update if I can ping the internet.

UPDATE:
I can't connect to the internet when I disable the any-to-any rule for VLAN10 and enable the access list for the DNS Resolver.

Pfsense disable any to any VLAN10 rule
Pfsense enable dns resolve access list
ping from machine

@e-fin

e-fin commented Apr 23, 2024

> > I had the same problem and it was also related to DNS resolution, there is a step missed in the setup guide when configuring PFSense
> > When doing an nslookup from any host to pfsense I was getting "Query Refused" so I configured an access list on the DNS Resolver in pfsense and it fixed it.
> > Picture below is how I configured it, just made an allow rule for the 192.168.0.0/16 subnet
> > Screenshot 2024-04-22 at 6 19 13 PM
> > hope this helps others, I was having the same issue as OP.
>
> I will try to change the dns_server in DC-X and SRV to 192.168.10.1 again and try your solution. Will update if i can ping the internet
>
> UPDATE: I can't connect to internet when disable the any to any rule for VLAN10 and enable Access List for DNS Resolver
>
> Pfsense disable any to any VLAN10 rule Pfsense enable dns resolve access list ping from machine

Try adding a firewall rule that allows the "ANY" protocol so that ICMP is allowed; it looks like your rules only allow TCP and UDP. I wasn't able to ping until I added an ICMP rule as well.

It also looks like you need a firewall rule to allow traffic to the internet; that ANY ANY rule seems to be acting as that rule.

It looks like google was able to resolve, so your DNS issue appears to be fixed.
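
To illustrate why ping fails when only TCP and UDP are allowed, here is a toy model of first-match rule evaluation with an implicit default deny. It is a simplified sketch, not pfSense's actual implementation: it looks at the protocol only and ignores source, destination, and ports. The `Rule` class and `evaluate` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    action: str           # "pass" or "block"
    protocols: frozenset  # protocol names matched; {"any"} matches everything


def evaluate(rules, protocol: str) -> str:
    """Return the action of the first rule matching the protocol."""
    for rule in rules:
        if "any" in rule.protocols or protocol in rule.protocols:
            return rule.action
    return "block"  # nothing matched: implicit default deny


tcp_udp_only = [Rule("pass", frozenset({"tcp", "udp"}))]
with_any = [Rule("pass", frozenset({"any"}))]

for proto in ("tcp", "udp", "icmp"):
    print(proto, evaluate(tcp_udp_only, proto), evaluate(with_any, proto))
# tcp pass pass
# udp pass pass
# icmp block pass
```

In this model, DNS lookups (UDP) succeed under the TCP/UDP-only rule set while ICMP echo requests fall through to the default deny, which matches a situation where names resolve but ping does not work.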

@aancw

aancw commented Apr 25, 2024

> Try adding a firewall rule that allows "ANY" protocol to allow ICMP, looks like your rules only allow TCP and UDP. i wasnt able to ping until i added an ICMP rule as well.

Will try that suggestion.

> Also looks like you need a firewall rule to allow traffic to the internet, that ANY ANY rule seems to be acting as that rule.

Is it necessary to add ANY ANY in VLAN10? I think ANY ANY is already allowed in WAN rules for internet traffic.

> It looks like google was able to resolve so your DNS issue appears to be resolved.

You're right, the dns resolved.

@h4ckd0tm3

Similar Issue.
I tested with pfctl -d (packet filter disabled), so I don't think it's a firewall issue. I also had issues with the provisioning VM getting internet; I solved that by adding an iptables rule: iptables -t nat -A POSTROUTING -o vmbr0 -s 192.168.1.0/24 -j MASQUERADE. But I'm not really sure what to do here...

Tracert shows it's stuck on the first hop (192.168.10.1)

image
