We just wrapped up the June ’24 AI, Machine Learning and Computer Vision Meetup, and if you missed it or want to revisit it, here’s a recap! In this blog post you’ll find the playback recordings, highlights from the presentations and Q&A, as well as the upcoming Meetup schedule so that you can join us at a future event.
In lieu of swag, we gave Meetup attendees the opportunity to help guide a $200 donation to charitable causes. The charity that received the highest number of votes this month was Heart to Heart International, an organization that ensures quality care is provided equitably in medically under-resourced communities and in disaster situations. We’re sending this event’s charitable donation of $200 to Heart to Heart International on behalf of the Meetup members!
Missed the Meetup? No problem. Here are playbacks and talk abstracts from the event.
Text-to-image diffusion models demonstrate remarkable editing capabilities in the image domain, especially after Latent Diffusion Models made diffusion models more scalable. Conversely, video editing still has much room for improvement, particularly given the relative scarcity of video datasets compared to image datasets. Therefore, we will discuss whether pre-trained text-to-image diffusion models can be used for zero-shot video editing without any fine-tuning stage. Finally, we will also explore possible future work and interesting research ideas in the field.
Speaker: Bariscan Kurtkaya is a KUIS AI Fellow and a graduate student in the Department of Computer Science at Koc University. His research interests lie in exploring and leveraging the capabilities of generative models in the realm of 2D and 3D data, encompassing scientific observations from space telescopes.
- Could this be applied to few-shot or zero-shot learning? In particular, could paraphrasing the object description be used by the model to detect objects not present in the training dataset?
- Are Lineart and Softedge edge filters?
Vision-and-language models that are trained to associate images with text have been shown to be effective for many tasks, including object detection and image segmentation. In this talk, we will discuss how to enhance vision-and-language models’ ability to localize objects in images by fine-tuning them for self-consistent visual explanations. We propose a method that augments text-image datasets with paraphrases using a large language model and employs SelfEQ, a weakly-supervised strategy that promotes self-consistency in visual explanation maps. This approach broadens the model’s working vocabulary and improves object localization accuracy, as demonstrated by performance gains on competitive benchmarks.
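To make the idea of self-consistency a little more concrete, here is a minimal sketch of a penalty that compares the explanation maps a model produces for two paraphrases of the same caption. It is an illustration under stated assumptions, not the SelfEQ objective itself: the function names `normalize` and `consistency_penalty`, the min-max normalization, and the mean-squared-difference form are all hypothetical choices made for this example, and the toy heatmaps are invented values.

```python
from typing import List

Heatmap = List[List[float]]  # a 2D explanation map, one value per image region


def normalize(heatmap: Heatmap) -> Heatmap:
    """Min-max rescale a heatmap to [0, 1] so maps from different prompts are comparable."""
    flat = [v for row in heatmap for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map carries no localization signal; map it to all zeros.
        return [[0.0 for _ in row] for row in heatmap]
    return [[(v - lo) / span for v in row] for row in heatmap]


def consistency_penalty(map_a: Heatmap, map_b: Heatmap) -> float:
    """Mean squared difference between two normalized heatmaps of the same shape.

    Assumption: a low value means the two paraphrases point at the same regions.
    """
    if len(map_a) != len(map_b) or any(
        len(ra) != len(rb) for ra, rb in zip(map_a, map_b)
    ):
        raise ValueError("heatmaps must have the same shape")
    a, b = normalize(map_a), normalize(map_b)
    n = sum(len(row) for row in a)
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


if __name__ == "__main__":
    # Toy maps for two paraphrases of the same object (values are made up).
    map_train = [[0.1, 0.9], [0.1, 0.2]]
    map_locomotive = [[0.2, 0.8], [0.1, 0.3]]
    print("penalty:", consistency_penalty(map_train, map_locomotive))
```

In a fine-tuning loop, a term like this would be added to the training loss so that paraphrased prompts are pushed toward highlighting the same image regions; the exact formulation used in the talk may differ.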
Speakers: Dr. Paola Cascante-Bonilla received her Ph.D. in Computer Science at Rice University in 2024, advised by Professor Vicente Ordóñez Román, working on Computer Vision, Natural Language Processing, and Machine Learning. She received a Master of Computer Science at the University of Virginia and a B.S. in Engineering at the Tecnológico de Costa Rica. Paola will join Stony Brook University (SUNY) as an Assistant Professor in the Department of Computer Science. Ruozhen (Catherine) He is a first-year Computer Science PhD student at Rice University, advised by Prof. Vicente Ordóñez, focusing on efficient algorithms in computer vision with less or multimodal supervision. She aims to leverage insights from neuroscience and cognitive psychology to develop interpretable algorithms that achieve human-level intelligence across versatile tasks.
- Could you elaborate on how the model ensures robustness when faced with variations in textual descriptions, such as ‘a train’ versus ‘a choo choo’, and how the mathematical constraints incorporated through 𝐿𝑐𝑠𝑡 contribute to maintaining consistency between the generated images and the original inputs?
- Training from scratch is expensive, so have you tried using your approach for data augmentation at train time (in a train-from-scratch setting)?
- The results are really impressive, but I wonder if they are a reflection of biases in the base model. Have you tried comparing your approach to a Mixture-of-Experts approach?
- Video can be a great source of data for consistency. How scalable is your prompt generation approach? Could it work for generating pseudo-labels from video?
- To what level of abstraction is your model robust? For example, if I say that I want to identify a “tower-like” tree, will it be able to predict that with high accuracy?
Datasets and Models are the two pillars of modern machine learning, but connecting the two can be cumbersome and time-consuming. In this lightning talk, you will learn how the seamless integration between Hugging Face and FiftyOne simplifies this complexity, enabling more effective data-model co-development. By the end of the talk, you will be able to download and visualize datasets from the Hugging Face hub with FiftyOne, apply state-of-the-art transformer models directly to your data, and effortlessly share your datasets with others.
Speaker: Jacob Marks, PhD is a Machine Learning Engineer and Developer Evangelist at Voxel51, where he leads open source efforts in vector search, semantic search, and generative AI for the FiftyOne data-centric AI toolkit. Prior to joining Voxel51, Jacob worked at Google X, Samsung Research, and Wolfram Research.
- Does FiftyOne use the Hugging Face dataset cache?
- For zero-shot or caption generation, detection, or classification, if we want to use a custom dataset after reviewing it through the Voxel51 UI, can we remove or preprocess the faulty data directly from the UI?
The goal of the Meetups is to bring together communities of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of AI and complementary technologies.
Join one of the 12 Meetup locations closest to your timezone.
Up next on July 3rd, 2024 at 2:00 PM BST and 6:30 PM IST, we have three great speakers lined up!
- Performance Optimisation for Multimodal LLMs - Neha Sharma, Technical PM at Ori Industries
- 5 Handy Ways to Use Embeddings, the Swiss Army Knife of AI - Harpreet Sahota, Hacker-in-residence at Voxel51
- Deep Dive: Responsible and Unbiased GenAI for Computer Vision - Daniel Gural, ML Engineer at Voxel51
Register for the Zoom here. You can find a complete schedule of upcoming Meetups on the Voxel51 Events page.
There are a number of ways to get involved in the Computer Vision Meetups. Reach out if you identify with any of these:
- You’d like to talk at an upcoming Meetup
- You have a physical meeting space in one of the Meetup locations and would like to make it available for a Meetup
- You’d like to co-organize a Meetup
- You’d like to co-sponsor a Meetup
Reach out to Meetup co-organizer Jimmy Guerrero on Meetup.com or ping me over LinkedIn to discuss how to get you plugged in.
—
These Meetups are sponsored by Voxel51, the company behind the open source FiftyOne computer vision toolset. FiftyOne enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster. It’s easy to get started, in just a few minutes.