[Python-il] Apache with mod WSGI (for django) crashes when you "import nltk"

gritchie at gmail.com gritchie at gmail.com
Mon Oct 14 20:26:58 IST 2013


Hi Vish - Did you manage to get this working somehow? I am also having 
problems using NLTK in a Django app. I'd appreciate any tips you have 
on setting things up correctly.

Thanks,

Graham

On Friday, February 1, 2013 10:49:00 AM UTC, Avishalom Shalit wrote:
>
> Thanks.
> Actually, this is an internal app, only available on our VPN,
> so security is not an issue,
> and I only expect a maximum of 4 users.
>
> I will look at the other setups.
> Thanks
>
> -- vish
>
>
>
> On 31 January 2013 23:48, Emanuel Ilyayev <emi... at gmail.com> wrote:
>
>> I don't know enough NLTK, but I work with Django :)
>>
>> From Asaf's description, it looks like you have to change your
>> architecture. Apache, in its default configuration, is not efficient at
>> working with heavy processes because it creates a new process for each
>> request. There are better setups, such as Gunicorn or uWSGI, that load n
>> workers and distribute the work between them (usually n = number of cores x
>> 2 + 1).
>>
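The worker-count rule of thumb quoted above (n = number of cores x 2 + 1) can be sketched in a few lines of Python. This is only an illustration: the helper name `suggested_workers` is made up for this example, and the right number for a memory-heavy library such as NLTK may well be lower.

```python
import multiprocessing


def suggested_workers(cores=None):
    """Rule of thumb from the thread: number of cores x 2 + 1."""
    if cores is None:
        cores = multiprocessing.cpu_count()
    return cores * 2 + 1


# A Gunicorn config file is itself a Python module, and `workers` is the
# setting that controls the number of worker processes.
workers = suggested_workers()
```

Assigning the result to `workers` mirrors how the value would be used in a Gunicorn config file; with uWSGI the same number would be passed through its own configuration instead.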
>> A more robust and scalable setup would include separate workers that
>> answer the NLTK requests asynchronously, with Django reaching these
>> workers via a message queue. This setup would allow you to put your NLTK
>> workers even on a separate machine, without creating a situation where your
>> web server competes with your NLTK workers for limited resources (CPU
>> and RAM).
>>
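The queue-plus-workers idea described above can be sketched with the standard library alone. This is a rough illustration, not the setup anyone in the thread actually ran: an in-process `queue.Queue` and threads stand in for a real message broker and separate worker processes (for example Celery workers), and `heavy_nlp_task` is a hypothetical placeholder for an NLTK call.

```python
import queue
import threading


def heavy_nlp_task(text):
    # Hypothetical stand-in for an NLTK call; it only splits on whitespace.
    return text.split()


def worker(tasks, results):
    # Pull jobs off the queue until a None sentinel arrives.
    while True:
        job = tasks.get()
        if job is None:
            break
        job_id, text = job
        results[job_id] = heavy_nlp_task(text)


def run_jobs(texts, n_workers=2):
    tasks = queue.Queue()
    results = {}
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    # The "web" side only enqueues work; it never runs the heavy task itself.
    for job_id, text in enumerate(texts):
        tasks.put((job_id, text))
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return [results[i] for i in range(len(texts))]
```

The point of the design is that the code handling web requests only puts jobs on the queue, so the heavy library is loaded by the workers alone. In a real deployment the queue would live in a broker and the workers could run on another machine.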
>> Even if you eventually find a way to configure Apache to load NLTK
>> without crashing, the URL that handles NLTK requests would be a perfect
>> point to attack your server and bring it into a DoS (denial of service)
>> situation using only a couple of strong machines hitting this URL.
>>
>> I urge you to read a little bit about gevent and Celery to understand 
>> what I'm talking about.
>>
>> HTH
>>
>> --
>> Emanuel
>>
>>
>>
>>
>> On Thu, Jan 31, 2013 at 7:30 PM, asaf greenberg <asafgr... at gmail.com> wrote:
>>
>>>
>>> I don't know enough Django, but I have worked with NLTK.
>>> NLTK is a very heavy module; lag on import is expected, especially
>>> if you're using certain modules.
>>>
>>> AFAIK you should import it only once, on server (re)start, and it
>>> costs about 10-30 secs (did you optimize with *.pyc or *.pyo?). Unless you're
>>> short on RAM... but I hope that's not the case.
>>>
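The "import only once" point above can be checked with a small timing sketch. It assumes nothing about NLTK itself: `timed_import` is a made-up helper, and `json` is used as a lightweight placeholder for the module name. Within one process Python caches imported modules in `sys.modules`, so only the first import pays the load cost.

```python
import importlib
import sys
import time


def timed_import(name):
    """Import a module by name and return (module, seconds taken)."""
    start = time.perf_counter()
    module = importlib.import_module(name)
    return module, time.perf_counter() - start


# "json" is a placeholder; in the thread's setting the name would be "nltk".
first_module, first_seconds = timed_import("json")
second_module, second_seconds = timed_import("json")
```

Both calls return the same module object, and the second is served from the cache. The cost comes back whenever the server starts a fresh process, which is why the number and lifetime of worker processes matters for a heavy import.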
>>> NLTK also has many sub-modules, which can and should be disabled for
>>> performance.
>>>
>>> Does it hang elsewhere (apart from server startup)?
>>> Does it have a longer delay than 20-30 secs?
>>>
>>>
>>>
>>> On 1/31/2013 6:44 PM, Avishalom Shalit wrote:
>>>
>>>  As title.
>>>
>>>  It just silently hangs.
>>>
>>>  As far as I found on Google, other people have run into it,
>>>  but nobody posted a solution.
>>>
>>>  Has anybody overcome this before?
>>>
>>>  Thanks
>>>
>>>  -- vish
>>>
>>>  
>>> _______________________________________________
>>> Python-il mailing list
>>> Pyth... at hamakor.org.il
>>> http://hamakor.org.il/cgi-bin/mailman/listinfo/python-il
>>>
>>>
>>
>