Skip to content

Conversation

marianocrosetti
Copy link
Contributor

Description

Makes CoreNLPClient not check if the server is alive when start_server=StartServer.DONT_START

Fixes Issues

#1059

Unit test coverage

test_external_server renamed to test_external_server_available (and modified)
test_external_server_timeout added
test_external_server_unavailable added

pytest executed successfully on stanza/tests/server/test_client.py

Known breaking changes/behaviors

Now when start_server=StartServer.DONT_START the clients must be sure the server is running. Otherwise (if they launched a server instance but didn't wait for enough) they could get a connection error.

…rver.DONT_START

Makes CoreNLPClient not checking if the server is alive when start_server=StartServer.DONT_START
raise TimeoutException(r.text)
else:
raise AnnotationException(r.text)
except requests.exceptions.Timeout as e:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks
I also take off the:
if r.text == "CoreNLP request timed out. Your document may be too long.":
(from below) because it has probably not sense to check r.text after a HTTP exception (probably it will not have anything)

""".strip()

class HTTPMockServerTimeoutContext:
""" For lunching an HTTP server on certain port with an specified delay at responses """
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^launching

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry! The spellchecker didn't notice (maybe) because lunching exists. Funny
I corrected with a second commit


class HTTPMockServerTimeoutContext:
""" For lunching an HTTP server on certain port with an specified delay at responses """
def __init__(self, port, timeout_secs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i kind of want this to find an open port... but at the same time i recognize there are a lot of tests which already use 9001 in exactly the same manner

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok if we leave this class as it is (with the parametrized port) and delegate the responsibility of setting the port to the caller?

[Text=. CharacterOffsetBegin=66 CharacterOffsetEnd=67 PartOfSpeech=.]
""".strip()

class HTTPMockServerTimeoutContext:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea a lot. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, many people also use flask or pytest-httpserver for mocking HTTP servers. Later could be a better option but I think this doesn't introduce any extra dependency (as http.server is installed by default)

@AngledLuffa
Copy link
Collaborator

one minor change requested, plus if you can think of a way around assuming 9001 is always open, that would be great. thanks!

@marianocrosetti
Copy link
Contributor Author

marianocrosetti commented Jun 27, 2022

one minor change requested, plus if you can think of a way around assuming 9001 is always open, that would be great. thanks!

When you bind a socket and give 0 as a port number, the OS will bind it to the first available unused port.

  • I can refactor HTTPMockServerTimeoutContext so enter returns this port number so we can do something like:
with HTTPMockServerTimeoutContext(0, 20) as port:
    # Here we use f'http://localhost:{port}'

(The binding process occurs at HTTPServer constructor called so it's an easy refactor)

  • In order to solve the problem for other test cases we can:
  1. Choose a free port doing something like:
import socket
sock = socket.socket()
sock.bind(('', 0))
sock.getsockname()[1]

Credits: SO

Notice that nothing the port remains free (it could happen a running condition, for example if tests are executed in parallalle and interfere each other). I suggest the number (2) solution.

  1. Execute the CoreNLPServer with port=0 and find the port choosed by scanning the ports used by the process
    Example of code:
import psutil

def get_port(pid):
    connections = psutil.net_connections()
    for con in connections:
        if con.pid == pid:
            if con.raddr != tuple():
                return con.raddr.port
            if con.laddr != tuple():
                return con.laddr.port
    return -1

corenlp_home = os.getenv('CORENLP_HOME')
start_cmd = 'python -m http.server 8000'
start_cmd = start_cmd and shlex.split(start_cmd)
p = subprocess.Popen(start_cmd)
print(p.pid)
time.sleep(5)
port=get_port(p.pid)

Credits: SO

I tested and on test_client.py and this solution works also for edu.stanford.nlp.pipeline.StanfordCoreNLPServer

Notice that the java edu.stanford.nlp.pipeline.StanfordCoreNLPServer ALWAYS use the port 9000 if not port is provided, so we need to provide "--port 0" (I tested and it works).

Tell me what you prefer. But I think it's better to only modify HTTPMockServerTimeoutContext on this PR. Then I can create other PR for fixing the other test cases if you need.

@AngledLuffa
Copy link
Collaborator

Since half the test uses 9001 anyway, it's not necessary to change any part of it to scan for open ports. It would be a nice TODO for later, though. I'll take another look when I get home and merge it

@AngledLuffa AngledLuffa merged commit c77f345 into stanfordnlp:dev Jun 28, 2022
@AngledLuffa
Copy link
Collaborator

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants