AI summary · 2 sources · 6 days ago

AI learns spatial reasoning, making video and animation generation more accurate

Researchers have released a new framework and a new benchmark for spatial reasoning in AI. The work aims to help LLMs write video and animation code that is geometrically correct, rather than relying on pixel-level diffusion, which often produces overlap or misalignment. The Hugging Face team has also released a fine-tuning guide for NVIDIA Cosmos using LoRA/DoRA that developers can try right away.

Updates

The PRISM benchmark contains 10,372 instruction-code pairs for evaluating the spatial-temporal reasoning of LLMs, 20 times larger than earlier benchmarks.
The interaction locality framework measures whether a model's information flow stays confined to small local regions or reaches across long distances, using sparse autoencoder ablations and activation patching.
Educational animation generation has to deal with render defects (overlap, misalignment, broken continuity) that cannot be seen in the code and only show up after it is run.
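
To make the last point concrete, here is a minimal sketch of how an overlap defect might be caught once element positions are known after rendering. It is not the method of any of the papers listed; the `Box` record, the element names, and the coordinates are hypothetical, and it assumes each rendered element can be reduced to an axis-aligned bounding box.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one rendered element (hypothetical record)."""
    name: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


def overlap_area(a: Box, b: Box) -> float:
    """Return the intersection area of two boxes, or 0.0 if they do not overlap."""
    dx = min(a.x + a.width, b.x + b.width) - max(a.x, b.x)
    dy = min(a.y + a.height, b.y + b.height) - max(a.y, b.y)
    return dx * dy if dx > 0 and dy > 0 else 0.0


def find_overlaps(boxes: list[Box], min_area: float = 0.0) -> list[tuple[str, str, float]]:
    """List every pair of elements whose boxes intersect by more than min_area."""
    hits = []
    for a, b in combinations(boxes, 2):
        area = overlap_area(a, b)
        if area > min_area:
            hits.append((a.name, b.name, area))
    return hits


# Hypothetical layout of a single frame, as it might be measured after rendering.
frame = [
    Box("title", 0, 0, 200, 40),
    Box("formula", 150, 30, 100, 50),   # collides with the title's lower-right corner
    Box("axis_label", 0, 100, 80, 20),
]
overlaps = find_overlaps(frame)
```

The check works on measured positions rather than on the source code, which is why this kind of defect only becomes detectable after the animation has been run.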

Original sources · 6

All original links are included so you can read the full articles and check the information yourself.

Hugging Face Blog · 6 days ago

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

arXiv cs.AI · May 22

Interaction Locality in Hierarchical Recursive Reasoning

arXiv cs.AI · May 20

PRISM: A Benchmark for Programmatic Spatial-Temporal Reasoning

Hugging Face Blog · May 18

Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation

arXiv cs.AI · May 18

See Before You Code: Learning Visual Priors for Spatially Aware Educational Animation Generation