Commit a402d48

[Frontend] Fix tcp port reservation for api server
PR vllm-project#8537 changed this code to bind the TCP port before the engine starts, to ensure that the port isn't used unexpectedly by something launched during engine startup. (Ray is mentioned in the discussion history and in the code comment.)

This change introduced some new unexpected behavior, as discussed in issue vllm-project#9737. Restarting vllm shortly after handling an API request where the client hasn't closed its end of the connection would cause vllm to fail to start with a "port in use" error.

The primary issue was the use of the `fd` option in the uvicorn config. This option does not actually do what we want. The relevant code can be found here:

https://github.com/encode/uvicorn/blob/fe3910083e3990695bc19c2ef671dd447262ae18/uvicorn/config.py#L496-L501

The important line to note is this one:

    sock = socket.fromfd(self.fd, socket.AF_UNIX, socket.SOCK_STREAM)

Note that `uvicorn` is expecting the fd to be for an `AF_UNIX` socket. We are passing in a TCP (`AF_INET`) socket. We seem to be lucky that this mostly works anyway, though it's surprising it works at all!

Instead of using this `fd` option, set the `SO_REUSEADDR` option on the socket, which will allow `uvicorn` to bind to the same address and port.

When reserving the port, we previously always specified `""` for the host. I fixed that too, so that we bind to the same host that we pass to `uvicorn` for it to bind to.

Finally, explicitly `close()` the socket as a final cleanup step.

Closes vllm-project#9737

Signed-off-by: Russell Bryant <[email protected]>
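The family mismatch described in the commit message can be reproduced in isolation. The following is a minimal sketch, assuming a POSIX system where `socket.AF_UNIX` is available and assuming CPython's behavior of trusting an explicitly passed family. The helper names `wrap_fd_as_unix` and `demo_family_mismatch` are hypothetical and exist only for this illustration; they are not part of vllm or uvicorn.

```python
import socket


def wrap_fd_as_unix(tcp_sock):
    """Re-wrap a socket's fd the same way the quoted uvicorn line does."""
    # socket.fromfd() duplicates the descriptor and uses the family we pass;
    # it is assumed here not to verify it against the underlying socket.
    return socket.fromfd(tcp_sock.fileno(), socket.AF_UNIX, socket.SOCK_STREAM)


def demo_family_mismatch():
    """Return (family of the real TCP socket, family reported by the wrapper)."""
    tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        tcp_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        wrapped = wrap_fd_as_unix(tcp_sock)
        try:
            return tcp_sock.family, wrapped.family
        finally:
            wrapped.close()
    finally:
        tcp_sock.close()


if __name__ == "__main__":
    original, wrapped = demo_family_mismatch()
    print("original:", original, "wrapped:", wrapped)
```

The wrapper object reports a different address family than the socket it was created from, which is consistent with the commit message's remark that the `fd` option only "mostly works" for a TCP socket.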
1 parent 04cef2c commit a402d48

1 file changed (+4, -2 lines changed)

vllm/entrypoints/openai/api_server.py

Lines changed: 4 additions & 2 deletions
@@ -569,7 +569,8 @@ async def run_server(args, **uvicorn_kwargs) -> None:
     # This avoids race conditions with ray.
     # see https://github.com/vllm-project/vllm/issues/8204
     sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
-    sock.bind(("", args.port))
+    sock.bind((args.host or "", args.port))
+    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
 
     def signal_handler(*_) -> None:
         # Interrupt server on sigterm while initializing
@@ -593,13 +594,14 @@ def signal_handler(*_) -> None:
             ssl_certfile=args.ssl_certfile,
             ssl_ca_certs=args.ssl_ca_certs,
             ssl_cert_reqs=args.ssl_cert_reqs,
-            fd=sock.fileno(),
             **uvicorn_kwargs,
         )
 
     # NB: Await server shutdown only after the backend context is exited
     await shutdown_task
 
+    sock.close()
+
 
 if __name__ == "__main__":
     # NOTE(simon):
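
The reserve-then-release pattern in this diff can be sketched on its own, outside of vllm. This is an illustration under stated assumptions, not the project's code: `reserve_port`, `run_with_reserved_port`, and the `start_server` callback are hypothetical names. Unlike the diff, the sketch sets `SO_REUSEADDR` before calling `bind()`, which is the more conventional ordering; that is a choice made for this example.

```python
import socket


def reserve_port(host, port):
    """Bind (without listening) to host:port so nothing else takes it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Example-only choice: set the option before bind().
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # Use the same host the server will later bind to, or "" if unset.
        sock.bind((host or "", port))
    except OSError:
        sock.close()
        raise
    return sock


def run_with_reserved_port(host, port, start_server):
    """Hold the reservation while start_server runs, then close it."""
    sock = reserve_port(host, port)
    try:
        bound_port = sock.getsockname()[1]
        # start_server is a hypothetical callback standing in for the
        # real server startup; it receives the reserved host and port.
        return start_server(host, bound_port)
    finally:
        # Explicit cleanup, mirroring the sock.close() added by the commit.
        sock.close()
```

Whether a second socket can bind the same address while the reservation is held depends on the platform and on that socket's own options, so this sketch only demonstrates the reservation and cleanup steps.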
