this post was submitted on 15 Jun 2024
78 points (91.5% liked)

Python

[–] hark@lemmy.world 4 points 4 months ago (1 children)

That depends on what the application needs to do. There's a reason why none of the performance-critical libraries for Python are written in Python.

[–] sugar_in_your_tea@sh.itjust.works 3 points 4 months ago (1 children)

Sure, and we use those, like numpy, scipy, and tensorflow. Python is best when gluing libraries together, so the more you can get out of those libraries, the better.

Python isn't fast, but it's usually fast enough to shuffle data from one library to the next.

[–] hark@lemmy.world 4 points 4 months ago (1 children)

Usually, but when it isn't then you've got a bottleneck. Multithreaded performance is a major weak point if you need to do any processing that isn't handled by one of the libraries.

[–] sugar_in_your_tea@sh.itjust.works 2 points 4 months ago* (last edited 4 months ago) (1 children)

Then you need to break up your problem into processes. Python threads don't really run Python code in parallel because of the GIL (hopefully that changes with the GIL going away), but most things can scale reasonably well in a process pool if you manage the worker queue properly (e.g. RabbitMQ works well).

It's not as good as proper threading, but it's a lot simpler and easier to scale horizontally. You can later rewrite certain parts if hosting costs become a larger issue than dev costs.
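
For what it's worth, here is a minimal sketch of the process-pool idea using only the standard library's `concurrent.futures`. The prime-counting workload and the names `count_primes`, `chunk_ranges`, and `parallel_count` are made up for illustration; a setup built around a broker like RabbitMQ would look different.

```python
from concurrent.futures import ProcessPoolExecutor
from math import isqrt


def count_primes(bounds: tuple[int, int]) -> int:
    """Hypothetical CPU-bound task: count primes in [lo, hi)."""
    lo, hi = bounds
    return sum(
        1
        for n in range(lo, hi)
        if n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))
    )


def chunk_ranges(limit: int, chunks: int) -> list[tuple[int, int]]:
    """Split [0, limit) into at most `chunks` contiguous half-open ranges."""
    step = max(1, -(-limit // chunks))  # ceiling division, never zero
    return [(start, min(start + step, limit)) for start in range(0, limit, step)]


def parallel_count(limit: int, workers: int = 4) -> int:
    # Each worker is a separate process with its own interpreter and GIL,
    # so the chunks can be processed on different cores at the same time.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Only the small (lo, hi) tuples and integer results cross the
        # process boundary, so pickling overhead stays low.
        return sum(pool.map(count_primes, chunk_ranges(limit, workers)))


if __name__ == "__main__":
    print(parallel_count(200_000))
```

The `if __name__ == "__main__"` guard matters on platforms that start workers by re-importing the main module.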

[–] hark@lemmy.world 3 points 4 months ago (1 children)

A process pool means extra copying of data between processes, which incurs a huge cost. This is made worse by the fact that parallel-processing-friendly workloads often consist of large amounts of data.

Yup, which is why you should try to limit the copying by designing your parallel processing algorithm around it. If you can't, you would handle threading with a native library or something and scale vertically instead of horizontally. Or pick a different language if it's a huge part of your app.

But in a lot of cases, it's reasonable to stick with Python and scale horizontally. That has value if you're otherwise a Python shop.
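
One way to limit the copying mentioned above, sketched with the standard library's `multiprocessing.shared_memory` (Python 3.8+): the parent writes the data into a shared block once, and each worker receives only the block name and an index range. The helper names `load_shared`, `sum_slice`, and `parallel_sum` are hypothetical, and summing integers is just a stand-in workload.

```python
from array import array
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import shared_memory

ITEM_SIZE = 8  # bytes per signed 64-bit integer (format code "q")


def load_shared(values: list[int]) -> shared_memory.SharedMemory:
    """Copy the values once into a new shared memory block."""
    n = len(values)
    shm = shared_memory.SharedMemory(create=True, size=max(n, 1) * ITEM_SIZE)
    shm.buf[: n * ITEM_SIZE] = array("q", values).tobytes()
    return shm


def sum_slice(task: tuple[str, int, int]) -> int:
    """Worker: attach to the block by name and sum items [start, stop)."""
    name, start, stop = task
    shm = shared_memory.SharedMemory(name=name)
    try:
        # The views must be released before close(), which the with block does.
        with shm.buf.cast("q") as view, view[start:stop] as part:
            return sum(part)
    finally:
        shm.close()


def parallel_sum(values: list[int], workers: int = 2) -> int:
    n = len(values)
    shm = load_shared(values)
    try:
        step = max(1, -(-n // workers))  # ceiling division, never zero
        # Each task is a tiny tuple; the bulk data is never pickled.
        tasks = [(shm.name, s, min(s + step, n)) for s in range(0, n, step)]
        with ProcessPoolExecutor(max_workers=workers) as pool:
            return sum(pool.map(sum_slice, tasks))
    finally:
        shm.close()
        shm.unlink()  # the creating process is responsible for cleanup


if __name__ == "__main__":
    print(parallel_sum(list(range(1000))))
```

This only pays off when the data fits a flat binary layout; arbitrary Python objects would still need to be serialized.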