[Python-il] Apache with mod WSGI (for django) crashes when you "import nltk"

Avishalom Shalit avishalom at gmail.com
Fri Feb 1 12:49:00 IST 2013


thanks.
actually this is an internal app, only available on our VPN,
so security is not an issue,
and i only expect a maximum of 4 users.

i will look at the other setups.
thanks

-- vish



On 31 January 2013 23:48, Emanuel Ilyayev <emikil at gmail.com> wrote:

> I don't know NLTK well enough, but I work with Django :)
>
> From Asaf's description it looks like you have to change your
> architecture. Apache, in its default configuration, is not efficient at
> running heavy processes because it creates a new process for each
> request. There are better setups, such as Gunicorn or uWSGI, that load n
> workers and distribute the work between them (usually n = number of cores x
> 2 + 1).
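
The rule of thumb above can be written down as a Gunicorn config file. This is only a sketch: the helper name suggested_workers and the bind address are assumptions made for illustration, not something from this thread. The names `bind` and `workers` are the settings Gunicorn reads from a Python config file.

```python
import multiprocessing


def suggested_workers(cores):
    """Rule of thumb quoted above: number of cores x 2 + 1."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores * 2 + 1


# Gunicorn picks up these module-level names from its config file.
bind = "127.0.0.1:8000"  # assumed address, adjust for your setup
workers = suggested_workers(multiprocessing.cpu_count())
```

Saved as gunicorn.conf.py, it would be used roughly as `gunicorn -c gunicorn.conf.py mysite.wsgi`, where mysite.wsgi is a hypothetical module name. For an internal app with a handful of users, a smaller fixed number of workers may be enough.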
>
> A more robust and scalable setup would include separate workers that
> answer the NLTK requests asynchronously, with Django reaching these
> workers via a message queue. This setup allows you to put your NLTK
> workers on a separate machine, avoiding a situation where your
> web server competes with your NLTK workers for limited resources (CPU
> and RAM).
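
The shape of that queue-and-worker setup can be sketched with the standard library alone. This is not Celery and not a real message broker: queue.Queue stands in for the message queue, a thread stands in for the separate worker process, and analyze_text is a hypothetical placeholder for whatever NLTK call the app actually makes.

```python
import queue
import threading


def analyze_text(text):
    """Hypothetical stand-in for the heavy NLTK call."""
    return text.split()


def worker(jobs, results):
    """Pull jobs until a None sentinel arrives; store results by job id."""
    while True:
        job = jobs.get()
        if job is None:
            break
        job_id, text = job
        results[job_id] = analyze_text(text)


def run_jobs(texts):
    """Queue every text, let one worker drain the queue, return the results."""
    jobs = queue.Queue()
    results = {}
    thread = threading.Thread(target=worker, args=(jobs, results))
    thread.start()
    for job_id, text in enumerate(texts):
        jobs.put((job_id, text))
    jobs.put(None)  # sentinel: no more work
    thread.join()
    return results
```

The point of the design is that the web process only puts small messages on the queue, while the process that holds NLTK in memory is started once and kept alive. With Celery the queue would live in a broker and the worker could run on another machine.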
>
> Even if you eventually find a way to configure Apache to load NLTK
> without crashing, the URL that handles NLTK requests would be a perfect
> point to attack your server and bring it into a DoS (denial of service)
> situation using only a couple of strong machines hitting this URL.
>
> I urge you to read a little about gevent and Celery to understand what
> I'm talking about.
>
> HTH
>
> --
> Emanuel
>
>
>
>
> On Thu, Jan 31, 2013 at 7:30 PM, asaf greenberg <asafgreenberg at gmail.com> wrote:
>
>>
>> I don't know Django well enough, but I have worked with NLTK.
>> NLTK is a very heavy module; lag on import is expected, especially if
>> you're using certain modules.
>>
>> AFAIK you should `import` it only once, on server (re)start, and it costs
>> about 10-30 secs (did you optimize with *pyc or *pyo?), unless you're short
>> on RAM... but I hope that's not the case.
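
One way to check the "only once" point is to time the import and confirm that a repeat import in the same process returns the cached module. A minimal sketch, assuming nothing about NLTK itself: the helper names timed_import and already_loaded are made up for illustration.

```python
import importlib
import sys
import time


def timed_import(name):
    """Import a module by name and return (module, seconds taken)."""
    start = time.perf_counter()
    module = importlib.import_module(name)
    return module, time.perf_counter() - start


def already_loaded(name):
    """True if this process has already imported the named module."""
    return name in sys.modules
```

Calling timed_import("nltk") once from the WSGI script at startup would show how long the first import really takes. Later imports in the same process come from sys.modules, so the cost is paid once per worker process rather than once per request.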
>>
>> NLTK also has many sub-modules, which can and should be disabled for
>> performance.
>>
>> Does it hang elsewhere (apart from server startup)?
>> Does it have a longer delay than 20-30 secs?
>>
>>
>>
>> On 1/31/2013 6:44 PM, Avishalom Shalit wrote:
>>
>>    As title.
>>
>>  It just silently hangs.
>>
>>  As far as I found on Google, other people have run into it,
>>  but nobody posted a solution.
>>
>>  Has anybody overcome this before?
>>
>>  thanks
>>
>>
>>  -- vish
>>
>>
>>
>> _______________________________________________
>> Python-il mailing list
>> Python-il at hamakor.org.il
>> http://hamakor.org.il/cgi-bin/mailman/listinfo/python-il
>>
>>
>>
>>
>
>