
Invalid response #211

Closed
arthurvi opened this issue May 25, 2017 · 31 comments
Labels
kind/failing-authorization Issue concerning failing ACME challenge

Comments

@arthurvi

Certificates are not renewing. I get the error: CA marked some of the authorizations as invalid.
When I look at the logs, I see that the response is not what Let's Encrypt expects. When I look at my custom server behind the nginx proxy, I can see the incoming requests for .well-known/acme-challenge.

This should not happen, right? Nginx should handle the .well-known/acme-challenge requests itself and not pass them to the server behind nginx. How can I prevent this and let Nginx answer the .well-known/acme-challenge requests so my certificates renew automatically?
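For reference, what I'd expect nginx to generate for the challenge path is roughly a location block like this (a sketch of how the companion serves challenges from the shared webroot, not the exact template output):

location /.well-known/acme-challenge/ {
    auth_basic off;
    allow all;
    root /usr/share/nginx/html;
    try_files $uri =404;
    break;
}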

@lounagen
Contributor

@arthurvi, if you haven't done it in the last month, you should docker pull your letsencrypt companion image.
The fix for this exact issue (#192) was merged and released last month.

@hamishfagg

I'm on the latest version and I'm having exactly the same issue. Let's Encrypt challenge requests are passed on to the container behind the proxy, which results in a 404 being returned to LE.

@buchdag
Member

buchdag commented May 28, 2017

I'm having this too, BUT right after that I get another request that nginx handles correctly, and the validation ends up succeeding despite the "CA marked some of the authorizations as invalid." warning. I don't get what's going on at all.

I was already using a version of the container including #192 when I noticed this behavior.

@arthurvi
Author

I am running the latest version but the problem still exists. The Let's Encrypt log keeps saying "Invalid response" because my API server is handling the request, not the Nginx proxy in front of it. Anything I can try?

@markhaasjes

I'm facing the same issue. I can open the link itself in a browser, but validation fails.

ERROR:acme.challenges:311: Unable to reach http://mywebsite.com/.well-known/acme-challenge/longHashDHz80NFyKUEM24Z4: HTTPConnectionPool(host='mywebsite.com', port=80): Max retries exceeded with url: /.well-known/acme-challenge/longHashDHz80NFyKUEM24Z4 (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f3ae0cf4350>: Failed to establish a new connection: [Errno 99] Address not available',))

@hamishfagg

hamishfagg commented Jun 7, 2017

I seem to have solved my issue by removing my certs folder and the container and letting the letsencrypt companion start from scratch.

Also make sure you have the vhost.d folder mounted on nginx-proxy =)
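A quick way to double-check the mounts on a running proxy container (the container name is an example, adjust to yours):

docker inspect -f '{{ range .Mounts }}{{ .Destination }} {{ end }}' nginx-proxy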

@helderco

That only delays the problem until those new certificates need to be renewed again.

Any solution yet?

@helderco

Removing the certificates doesn't help me. It creates empty folders for each domain, and I get the same validation error. I have a certificate that expires tomorrow; what should I do?

@buchdag
Member

buchdag commented Jun 14, 2017

Sounds like a misconfiguration or an outdated container somewhere. One of my own production setups correctly renewed two certificates over the past two weeks, and correctly generated a new one for testing purposes just now.

nginx-proxy-le | 2017-06-14 22:07:26,126:INFO:requests.packages.urllib3.connectionpool:756: Starting new HTTPS connection (1): letsencrypt.org
nginx-proxy-le | 2017-06-14 22:07:27,307:INFO:requests.packages.urllib3.connectionpool:207: Starting new HTTP connection (1): test.abcde.com
nginx-proxy    | test.abcde.com 172.18.0.1 - - [14/Jun/2017:22:07:27 +0000] "GET /.well-known/acme-challenge/FzjWnaKV8NJLeseNr8AyH0Ky3R0wWMt49qUa5L62QTA HTTP/1.1" 200 87 "-" "python-requests/2.8.1"
nginx-proxy-le | 2017-06-14 22:07:27,328:INFO:simp_le:1305: test.abcde.com was successfully self-verified
nginx-proxy-le | 2017-06-14 22:07:27,573:INFO:simp_le:1313: Generating new certificate private key
nginx-proxy    | test.abcde.com 66.133.109.36 - - [14/Jun/2017:22:07:27 +0000] "GET /.well-known/acme-challenge/FzjWnaKV8NJLeseNr8AyH0Ky3R0wWMt49qUa5L62QTA HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
nginx-proxy-le | 2017-06-14 22:07:28,954:INFO:simp_le:391: Saving account_key.json
nginx-proxy-le | 2017-06-14 22:07:28,956:INFO:simp_le:391: Saving key.pem
nginx-proxy-le | 2017-06-14 22:07:28,957:INFO:simp_le:391: Saving chain.pem
nginx-proxy-le | 2017-06-14 22:07:28,958:INFO:simp_le:391: Saving fullchain.pem
nginx-proxy-le | 2017-06-14 22:07:28,958:INFO:simp_le:391: Saving cert.pem
nginx-proxy-le | Reloading nginx docker-gen (using separate container nginx-proxy-gen)...

No LE challenge request is passed to the proxied container anymore. I have no idea why I got that last month on another server; I probably had a configuration issue myself that I don't even remember fixing.

Could you tell us more about how you run the nginx-proxy + letsencrypt-companion containers?

@helderco

I use docker cloud, with the following stack:

letsencrypt:
  image: 'jrcs/letsencrypt-nginx-proxy-companion:latest'
  restart: always
  volumes:
    - '/var/run/docker.sock:/var/run/docker.sock:ro'
  volumes_from:
    - nginx
nginx:
  image: 'jwilder/nginx-proxy:alpine'
  labels:
    com.github.jrcs.letsencrypt_nginx_proxy_companion.nginx_proxy: 'true'
  ports:
    - '80:80'
    - '443:443'
  restart: always
  volumes:
    - '/root/nginx/certs:/etc/nginx/certs'
    - '/root/nginx/conf.d:/etc/nginx/conf.d'
    - '/root/nginx/vhost.d:/etc/nginx/vhost.d'
    - '/root/docker-gen/nginx.tmpl:/app/nginx.tmpl:ro'
    - '/var/run/docker.sock:/tmp/docker.sock:ro'
    - '/apps/letsencrypt:/usr/share/nginx/html'
    - '/apps:/var/www'
    - '/root/nginx/htpasswd:/etc/nginx/htpasswd'

@buchdag
Member

buchdag commented Jun 15, 2017

Unfortunately I'm totally unfamiliar with both Docker Cloud and the single-container approach to nginx-proxy, so I don't think I'll be able to help you troubleshoot much. If it's of any help, here is my working docker-compose file:

version: '3'

services:
  nginx:
    image: nginx:1.13.1
    container_name: nginx-proxy
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - conf:/etc/nginx/conf.d
      - vhost:/etc/nginx/vhost.d
      - html:/usr/share/nginx/html
      - certs:/etc/nginx/certs
    labels:
      - "com.github.jrcs.letsencrypt_nginx_proxy_companion.nginx_proxy=true"

  dockergen:
    image: jwilder/docker-gen:0.7.3
    container_name: nginx-proxy-gen
    depends_on:
      - nginx
    command: -notify-sighup nginx-proxy -watch -wait 5s:30s /etc/docker-gen/templates/nginx.tmpl /etc/nginx/conf.d/default.conf
    volumes:
      - conf:/etc/nginx/conf.d
      - vhost:/etc/nginx/vhost.d
      - html:/usr/share/nginx/html
      - certs:/etc/nginx/certs
      - /var/run/docker.sock:/tmp/docker.sock:ro
      - ./nginx.tmpl:/etc/docker-gen/templates/nginx.tmpl:ro
  
  letsencrypt:
    image: jrcs/letsencrypt-nginx-proxy-companion
    container_name: nginx-proxy-le
    depends_on:
      - nginx
      - dockergen
    environment:
      NGINX_PROXY_CONTAINER: nginx-proxy
      NGINX_DOCKER_GEN_CONTAINER: nginx-proxy-gen
    volumes:
      - conf:/etc/nginx/conf.d
      - vhost:/etc/nginx/vhost.d
      - html:/usr/share/nginx/html
      - certs:/etc/nginx/certs
      - /var/run/docker.sock:/var/run/docker.sock:ro

volumes:
  conf:
  vhost:
  html:
  certs:

# Do not forget to 'docker network create nginx-proxy' before launch, and to add '--network nginx-proxy' to proxied containers.

networks:
  default:
    external:
      name: nginx-proxy

I get the nginx.tmpl file (the exact version I'm using right now is this one), create the nginx-proxy network, and then I'm good to go.

You can use volumes_from: if you switch back to version: '2'.
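In other words, the bootstrap boils down to something like this (the template URL is the usual upstream location, an assumption on my part; pin whichever version you need):

curl -fsSL https://raw.githubusercontent.com/jwilder/nginx-proxy/master/nginx.tmpl -o nginx.tmpl
docker network create nginx-proxy
docker-compose up -d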

@buchdag
Member

buchdag commented Jun 15, 2017

Just tested it again on a fresh install of Debian 8 and Docker.

simp_le self-verification fails, while on my Ubuntu 16.x and 17.x servers it works fine. I'll run more tests later and try to understand why.

Verification by LE then proceeds OK, the certificate gets created and I can browse to my test app.

nginx-proxy-le | 2017-06-15 08:49:20,497:INFO:simp_le:1211: Generating new account key
nginx-proxy-le | 2017-06-15 08:49:22,055:INFO:requests.packages.urllib3.connectionpool:756: Starting new HTTPS connection (1): acme-v01.api.letsencrypt.org
nginx-proxy-le | 2017-06-15 08:49:22,882:INFO:requests.packages.urllib3.connectionpool:756: Starting new HTTPS connection (1): letsencrypt.org
nginx-proxy-le | 2017-06-15 08:49:23,893:INFO:requests.packages.urllib3.connectionpool:207: Starting new HTTP connection (1): test.abcde.com
nginx-proxy-le | 2017-06-15 08:49:23,897:WARNING:simp_le:1303: test.abcde.com was not successfully self-verified. CA is likely to fail as well!
nginx-proxy-le | 2017-06-15 08:49:24,244:INFO:simp_le:1313: Generating new certificate private key
nginx-proxy    | test.abcde.com 66.133.109.36 - - [15/Jun/2017:08:49:24 +0000] "GET /.well-known/acme-challenge/GXIeYCfbB25DnA2kB8r9HlvW0fFPLl3tgh2_axZcGho HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
nginx-proxy-le | 2017-06-15 08:49:26,602:INFO:simp_le:391: Saving account_key.json
nginx-proxy-le | 2017-06-15 08:49:26,603:INFO:simp_le:391: Saving key.pem
nginx-proxy-le | 2017-06-15 08:49:26,604:INFO:simp_le:391: Saving chain.pem
nginx-proxy-le | 2017-06-15 08:49:26,604:INFO:simp_le:391: Saving fullchain.pem
nginx-proxy-le | 2017-06-15 08:49:26,605:INFO:simp_le:391: Saving cert.pem
nginx-proxy-le | Reloading nginx docker-gen (using separate container nginx-proxy-gen)...
test.abcde.com | 2017/06/15 08:49:23 [error] 5#5: *1 open() "/usr/share/nginx/html/.well-known/acme-challenge/GXIeYCfbB25DnA2kB8r9HlvW0fFPLl3tgh2_axZcGho" failed (2: No such file or directory), client: 172.18.0.4, server: localhost, request: "GET /.well-known/acme-challenge/GXIeYCfbB25DnA2kB8r9HlvW0fFPLl3tgh2_axZcGho HTTP/1.1", host: "test.abcde.com"
test.abcde.com | 172.18.0.4 - - [15/Jun/2017:08:49:23 +0000] "GET /.well-known/acme-challenge/GXIeYCfbB25DnA2kB8r9HlvW0fFPLl3tgh2_axZcGho HTTP/1.1" 404 169 "-" "python-requests/2.8.1" "-"
test.abcde.com | 172.18.0.2 - - [15/Jun/2017:08:50:22 +0000] "GET / HTTP/1.1" 200 324 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:54.0) Gecko/20100101 Firefox/54.0" "123.123.123.123"
duch@some-vps:~$ curl https://test.abcde.com
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<html lang="en">
    <head>
        <meta http-equiv="content-type" content="text/html; charset=utf-8">
        <title>Title Goes Here</title>
    </head>

    <body>
        <p>Test nginx-proxy</p>
    </body>
</html> 

@buchdag
Member

buchdag commented Jun 15, 2017

I just ran into a CA authorization error while performing additional tests. After trashing all four Docker volumes I use in the docker-compose file (conf, vhost, html and certs), CA authorization started working again.

@helderco

helderco commented Jun 16, 2017

Could it be that the CA authorization doesn't follow redirects? It's trying to access the site over HTTP, but there is a permanent redirect to HTTPS.

This could explain why it works the first time but not on renewal, and why it works when we paste the URI into the browser. It times out when trying to access the challenge.

I'm going to try to change the template to add the challenge location to the redirects.

@helderco

Still getting timeouts without the redirects.

@helderco

Could it be an issue with IPv6?

@buchdag
Member

buchdag commented Jun 16, 2017

I don't know the internals, but I think LE will always prefer validation through IPv4 if available. I have both IPv4 and IPv6 configured on my hosts, and DNS for proxied services resolves to both addresses, but I have never seen an IPv6 request from an LE server on any of them.

I would advise against modifying the template as it would make further issues even harder to troubleshoot for you.

Renewals do work perfectly fine on my already set up proxy stacks with the vanilla nginx.tmpl, so I still think you have a configuration file somewhere that prevents CA validation, either one of your own or a container-generated one that's stuck in a bad state.

Reverting your stack configuration to something closer to the base configuration would give you a clean start. More specifically, change from:

    volumes:
      - '/root/nginx/certs:/etc/nginx/certs'
      - '/root/nginx/conf.d:/etc/nginx/conf.d'
      - '/root/nginx/vhost.d:/etc/nginx/vhost.d'
      - '/root/docker-gen/nginx.tmpl:/app/nginx.tmpl:ro'
      - '/var/run/docker.sock:/tmp/docker.sock:ro'
      - '/apps/letsencrypt:/usr/share/nginx/html'
      - '/apps:/var/www'
      - '/root/nginx/htpasswd:/etc/nginx/htpasswd'

to something like

    volumes:
      - certs:/etc/nginx/certs
      - conf:/etc/nginx/conf.d
      - vhost:/etc/nginx/vhost.d
      - /root/docker-gen/nginx.tmpl:/app/nginx.tmpl:ro
      - /var/run/docker.sock:/tmp/docker.sock:ro
      - html:/usr/share/nginx/html

Again, I'm not familiar with Docker Cloud. My idea is to use freshly created named volumes to revert all configuration dirs/files to a base state, check if that gets CA validation working again, and if it does, try adding your own custom configuration files one by one like this:

      - conf:/etc/nginx/conf.d
      - /root/nginx/conf.d/somefile.conf:/etc/nginx/conf.d/somefile.conf

until you find which one prevents CA validation.

Also, are you sure your proxied containers are configured properly? Better check that too.
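For reference, a minimal proxied container for this kind of setup looks something like this (domain and email are placeholders):

docker run -d --name test-app --network nginx-proxy \
  -e VIRTUAL_HOST=test.example.com \
  -e LETSENCRYPT_HOST=test.example.com \
  -e LETSENCRYPT_EMAIL=admin@example.com \
  nginx:alpine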

@helderco

helderco commented Jun 16, 2017

It was indeed the IPv6!

The CA authorization chose IPv6 over IPv4. I remember Docker having some issues with IPv6, which will need some further testing in my setup.

After removing the AAAA record for my domain, the connection went through successfully over IPv4 and the certificate got renewed.
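A quick way to confirm which records a domain still publishes (domain is a placeholder):

dig +short A example.com
dig +short AAAA example.com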

@buchdag
Member

buchdag commented Jun 16, 2017

I think at some point we might have to add a troubleshooting guide to this container.

Do you have any insight into why LE chose IPv6 over IPv4 to reach your domain for validation? And why the nginx container failed to answer the request made over IPv6 properly?

@helderco

Docker only supports IPv4 by default, so I'll probably just need to enable the dual stack in the daemon at this point. I'll try it later when I can.

It makes sense that LE would prefer IPv6 since it's the future. We should be pushing everyone to it as much as possible.
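If I understand the Docker docs correctly, enabling the dual stack amounts to something like this in /etc/docker/daemon.json, followed by a daemon restart (the prefix below is a documentation prefix, substitute one that is actually routed to your host):

{
  "ipv6": true,
  "fixed-cidr-v6": "2001:db8:1::/64"
}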

@buchdag
Member

buchdag commented Jun 16, 2017

I did not enable the dual stack in the Docker daemon, and yet my proxied services and the ACME challenges are reachable both through IPv4 and IPv6. This might be related to the fact that the containers are connected to a user-created bridge network (the nginx-proxy network in my docker-compose file), not to Docker's default bridge network.

@helderco

Did you have to do anything for IPv6 or does it work by default?

@buchdag
Member

buchdag commented Jun 16, 2017

I did not configure anything specific on the Docker side; the only IPv6-related config I did on each host was setting up the correct (static) addresses on the real public-facing interfaces.

The command I use to create my docker network is the following:

docker network create -o com.docker.network.bridge.name=nginx-proxy nginx-proxy

The com.docker.network.bridge.name option is only there to give the gateway interface a clean kernel name.

The output of ifconfig nginx-proxy then looks like this:

nginx-proxy: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.xxx.xxx.xxx  netmask 255.255.0.0  broadcast 0.0.0.0
        inet6 fe80::xx:xxx:xxxx:xxxx  prefixlen 64  scopeid 0x20<link>
        ether 02:xx:xx:xx:xx:xx  txqueuelen 0  (Ethernet)
        RX packets X  bytes X (X MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets X  bytes X (X MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

A local IPv4 address, a link-local IPv6 address, and that's it.

Edit: by the way, you did put your finger on something else. When a certificate is present, no matter whether it is valid or not, the nginx.tmpl will add a 302 redirect to HTTPS.

That means that if, for one reason or another, one of your certificates expires, you won't be able to renew it without deleting the old one first, as the CA validation will be redirected to HTTPS with an expired certificate and will fail.
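To check what actually happens to a challenge request on a given stack, you can follow the redirect chain yourself (hostname and token are placeholders; -k ignores certificate errors the way a lenient validator would):

curl -ILk http://example.com/.well-known/acme-challenge/test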

@cpu

cpu commented Jun 16, 2017

Could it be that the CA authorization doesn't follow redirects? It's trying to access the http, but it has a permanent redirect to https.

The Let's Encrypt CA, Boulder, does follow redirects on HTTP-01 challenges (up to a limit of 10).

I don't know the internals but I think LE will always prefer validation through IPv4 if available.

That was true historically, but changed recently.

Do you have any insight on why LE chose IPv6 over IPv4 to reach your domain for validation

The presence of an AAAA record for the domain will be used to infer that IPv6 should be attempted first.

That means that if, for one reason or another, one of your certificates expires, you won't be able to renew it without deleting the old one first, as the CA validation will be redirected to HTTPS with an expired certificate and will fail.

If an HTTP-01 challenge request received on port 80 gets redirected to port 443, Boulder will ignore certificate errors to prevent this sort of configuration from breaking validation. It should be OK if I'm understanding correctly (my docker-fu is extremely weak).

Hope these clarifications were helpful!

@buchdag
Member

buchdag commented Jun 16, 2017

They were extremely helpful, thank you @cpu !

@helderco

For anyone getting here in need of troubleshooting, here's how you know if the problem is IPv6.

If you run the container with debug on (DEBUG=true), you can see why a validation failed. You'll see something like this in the logs:

{
  "type": "http-01",
  "status": "invalid",
  "error": {
    "type": "urn:acme:error:connection",
    "detail": "Fetching http://example.org/.well-known/acme-challenge/GoXuMZ3iUg_-fowOA51_RlN8tiXHWqeCjbKMnR5C9T4: Timeout",
    "status": 400
  },
  "uri": "https://acme-v01.api.letsencrypt.org/acme/challenge/8Fo11ildCGTTidlkONH1Ib6xr_-rOTrlk3dE22D1t6o/1351010203",
  "token": "GoXuMZ3iUg_-fowOA51_RlN9tiXHWqeCjbKMnR5C9X4",
  "keyAuthorization": "GoXuMZ3iUg_-fowOA51_RlN9tiXHWqeCjbKMnR5C9X4.Ev_AZ-22qa96Oz2eHtD2vI8hwC_U5JPd0cFHOpCxg6E",
  "validationRecord": [
    {
      "url": "http://example.org/.well-known/acme-challenge/GoXuMZ3iUg_-fowOA51_RlN9tiXHWqeCjbKMnR5C9X4",
      "hostname": "example.org",
      "port": "80",
      "addressesResolved": [
        "189.162.145.42",
        "2a43:a0b1:1:a1::65:5101"
      ],
      "addressUsed": "2a43:a0b1:1:a1::65:5101",
      "addressesTried": []
    }
  ]
},

Notice that addressUsed is using the IPv6 address.

If you have a AAAA DNS record, make sure the address is reachable with a tester such as http://ipv6-test.com/validate.php.
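You can also probe this from outside by forcing IPv6 explicitly (hostname and token are placeholders); if this hangs or fails while the same request with -4 works, the AAAA record is the problem:

curl -6 -v http://example.org/.well-known/acme-challenge/test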

@sburnicki

I had a similar issue and fixed it by downgrading to jrcs/letsencrypt-nginx-proxy-companion:v1.4 as mentioned here.

@maxkueng

I had a similar issue, but it was not IPv6 related. After many hours of trying to resolve it I came to the following solution:

Let's Encrypt kept receiving "503 Service Temporarily Unavailable" as a response to the acme-challenge, and I got the same message in the browser. The trick to resolve this was to remove -only-exposed from the arguments passed to the jwilder/docker-gen container. I don't know why I had this in there, as it's not in any recent version of the README. I must have copied it from an old one.
It seems that with -only-exposed docker-gen couldn't find the application container, and the upstream example.com {} section in the Nginx config was empty.

After resolving this, the responses to the acme-challenges were "403 Forbidden" errors from Nginx instead. The files in .well-known/acme-challenge were actually present, and when I ran docker exec nginx bash I was able to access and read them. But somehow Nginx wasn't able to serve them. So I had to change the permissions of the directory mounted to /usr/share/nginx/html from 0750 to 0755, and then Nginx was able to serve the challenge files and the certificates renewed.
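On the host, that fix boils down to something like this (the path is an example; use whatever directory you mount at /usr/share/nginx/html):

chmod 0755 /path/to/acme/webroot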

I think it's really weird that I had to make the directory readable for "other" users, as the directory and all files are owned by root:root, Docker is running as root, and Nginx inside the container is also running as root. So shouldn't it work even if the permissions were just 0700?

But I had the same problem with some static sites that I'm running by just starting an nginx container and mounting the files. The files are readable from inside the container, but Nginx won't serve them.

@obaydmir

obaydmir commented Jun 27, 2017

I removed the AAAA record from my domain records and now I'm getting certificates from Let's Encrypt. A month ago I had another server on which I had set an AAAA record and I still got my certificates from Let's Encrypt. There is some random magic happening.

But at this moment I still get these error messages before I eventually receive the certificates:

ERROR:acme.challenges:311: Unable to reach http://my.domain.com/.well-known/acme-challenge/9JHkK8UBzk6G2QKNmz5UZLtABSXouI0ATPsbD8LbXLo: HTTPConnectionPool(host='my.domain.com', port=80): Max retries exceeded with url: /.well-known/acme-challenge/9JHkK8UBzk6G2QKNmz5UZLtABSXouI0ATPsbD8LbXLo (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f473870f850>: Failed to establish a new connection: [Errno 110] Operation timed out',))
WARNING:simp_le:1303: my.domain.com was not successfully self-verified. CA is likely to fail as well!

@cpu

cpu commented Jun 27, 2017

A month ago I had another server on which I had set AAAA record and I got my certificates from Let's Encrypt. Their is some random magic happening.

The Let's Encrypt validation server was changed to prefer IPv6 for dual-homed hosts just over one month ago: https://community.letsencrypt.org/t/preferring-ipv6-for-challenge-validation-of-dual-homed-hosts/34774 No magic in this case 🐰 🎩 ✨

@buchdag
Member

buchdag commented Jan 9, 2019

Closing issue due to inactivity.

@buchdag buchdag closed this as completed Jan 9, 2019