Finding the real IP: CloudFront -> ELB -> Nginx -> Rails

There are several good reasons to put CloudFront in front of your load balancers, as explained here:

Performance

  • Cacheable Content.
  • Global Reach.
  • Persistent Connections.
  • Collapsed Forwarding.

Security

  • Distributed Denial of Service (DDoS) protection.
  • Encryption in Transit.
  • Web Application Firewall (WAF).

Keep in mind that CloudFront comes with an additional fee, although data transfer from AWS origins such as ELB to CloudFront is not charged.

Behind CloudFront, we still need to load balance our EC2 instances for high availability. On each instance, requests are handled by Nginx and passed via proxy_pass to Puma (the Rails server).

That's a lot of hops to serve a request!

I wanted to log Rails requests with the real user IP address, but with all these hops that is not trivial.

What you need to do is forward the X-Forwarded-For header to the upstream. CloudFront sets this header, the ELB passes it along, and Nginx finally hands it to Puma:

upstream your_app {
    server unix:/home/ubuntu/www/shared/tmp/sockets/puma.sock fail_timeout=0;
}

server {
    listen 80 default_server;
    server_name yourapp.com;
    gzip on;

    location / {
        proxy_set_header X-Forwarded-Proto https;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; # <- this header !
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://your_app;
    }
}

That header is a comma-separated list, and each proxy along the way appends the address it received the request from. For this reason, the first IP is usually the real user IP, but this is not guaranteed: a client can send its own X-Forwarded-For header, and CloudFront appends to it instead of replacing it.
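To make the shape of the header concrete, here is a minimal sketch in plain Ruby. The helper name forwarded_hops, the ELB address 10.0.1.23 and the injected address 203.0.113.7 are made up for illustration; the other two addresses are the ones from the log line in this post.

```ruby
# Hypothetical helper (not part of Rack or Rails): split an
# X-Forwarded-For value into the list of addresses it carries.
def forwarded_hops(header_value)
  header_value.to_s.split(",").map(&:strip).reject(&:empty?)
end

# What Rails might see for a normal request:
# client IP, CloudFront edge IP, ELB IP (the ELB IP is an assumed value).
normal = "190.8.117.22, 64.252.68.65, 10.0.1.23"

# Same request, but the client sent its own X-Forwarded-For header,
# so the injected value ends up in front of the real client IP.
spoofed = "203.0.113.7, 190.8.117.22, 64.252.68.65, 10.0.1.23"

puts forwarded_hops(normal).first   # prints 190.8.117.22
puts forwarded_hops(spoofed).first  # prints 203.0.113.7
```

The second call shows why the first entry is fine for logging but should not be relied on for anything security related.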

With the header forwarded, we can now use it in Rails. I changed this in production.rb:

-  config.log_tags = %i[request_id remote_ip]
+  config.log_tags = [:request_id, ->(request) { request.forwarded_for&.first || request.remote_ip }]

The interesting part is the lambda added to log_tags:

->(request) { request.forwarded_for&.first || request.remote_ip }

forwarded_for is defined in the Rack request class:

def forwarded_for
  if value = get_header(HTTP_X_FORWARDED_FOR)
    split_header(value).map do |authority|
      split_authority(wrap_ipv6(authority))[1]
    end
  end
end

We take the first IP in the array, which, as mentioned earlier, corresponds to the first hop (the real user IP). The safe navigation operator (&.) covers the case where the header is not present and forwarded_for returns nil, and we fall back to remote_ip when no forwarded IP is found.
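The fallback behavior can be checked outside of Rails with a small sketch. FakeRequest and CLIENT_IP_TAG are hypothetical names used only here; in the real application the lambda receives the Rails request object instead of this stand-in.

```ruby
# Stand-in for the request object, for illustration only.
# It exposes just the two methods the lambda calls.
FakeRequest = Struct.new(:forwarded_for, :remote_ip)

# Same lambda as in production.rb, stored in a constant so it can be called.
CLIENT_IP_TAG = ->(request) { request.forwarded_for&.first || request.remote_ip }

# Header present: the first forwarded address wins.
with_header = FakeRequest.new(["190.8.117.22", "64.252.68.65"], "64.252.68.65")

# Header missing: forwarded_for is nil, so &. short-circuits and
# the lambda falls back to remote_ip.
without_header = FakeRequest.new(nil, "127.0.0.1")

puts CLIENT_IP_TAG.call(with_header)     # prints 190.8.117.22
puts CLIENT_IP_TAG.call(without_header)  # prints 127.0.0.1
```

An empty array behaves like a missing header, because first returns nil and the || falls through to remote_ip.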

With this change, a log line looks like this:

I, [2020-12-22T19:46:31.273143 #39407]  INFO -- : [dfbe8700-908a-4405-bb0c-a23ac5359a5f] [190.8.117.22] Started GET "/" for 64.252.68.65 at 2020-12-22 19:46:31 +0000

The first IP in this log line (190.8.117.22) is the real IP of the user, while the last IP (64.252.68.65) is the IP of the CloudFront edge location.