/**
 * Retweet data re-fetch script
 * Re-fetches incorrectly stored retweet schedules from Nitter and corrects them.
 *
 * Usage: node scripts/refetch-retweets.js [scheduleId1,scheduleId2,...]
 */
import mysql from 'mysql2/promise';
import { fetchSingleTweet, extractTitle } from '../src/services/x/scraper.js';

const NITTER_URL = process.env.NITTER_URL || 'http://nitter:8080';

const pool = mysql.createPool({
  host: process.env.DB_HOST || 'mariadb',
  port: parseInt(process.env.DB_PORT || '3306', 10),
  user: process.env.DB_USER || 'fromis9',
  password: process.env.DB_PASSWORD || 'fromis9',
  database: process.env.DB_NAME || 'fromis9',
});

async function main() {
  // Specific schedule IDs can be passed on the CLI as a comma-separated list.
  // Non-numeric entries and 0 are dropped by filter(Boolean).
  const argIds = process.argv[2]?.split(',').map(Number).filter(Boolean);

  let rows;
  if (argIds && argIds.length > 0) {
    // mysql2 expands the array bound to `IN (?)` into a value list.
    [rows] = await pool.query(
      `SELECT sx.schedule_id, sx.post_id, sx.username, sx.content
       FROM schedule_x sx
       WHERE sx.schedule_id IN (?)`,
      [argIds]
    );
  } else {
    // No IDs given: pick up rows that still look like raw retweets
    // or that contain leftover nitter/t.co links.
    [rows] = await pool.query(
      `SELECT sx.schedule_id, sx.post_id, sx.username, sx.content
       FROM schedule_x sx
       WHERE sx.content LIKE 'RT @%' OR sx.content LIKE '%nitter%t.co%'`
    );
  }

  console.log(`Targets: ${rows.length}`);

  if (rows.length === 0) {
    await pool.end();
    return;
  }

  let updated = 0;
  let failed = 0;

  for (const row of rows) {
    try {
      // Extract the original author from the "RT @username:" prefix.
      const rtMatch = row.content?.match(/^RT @(\w+):/);
      const fetchUsername = rtMatch ? rtMatch[1] : (row.username || 'realfromis_9');

      console.log(`[${row.schedule_id}] post_id=${row.post_id}, from=@${fetchUsername}`);

      const tweet = await fetchSingleTweet(NITTER_URL, fetchUsername, row.post_id);

      // Strip the "RT @username:" prefix from the fetched text.
      let newContent = tweet.text;
      const rtPrefixMatch = newContent.match(/^RT @\w+:\s*/);
      if (rtPrefixMatch) {
        newContent = newContent.slice(rtPrefixMatch[0].length);
      }
      // Remove a trailing ellipsis character (…).
      newContent = newContent.replace(/…$/, '').trim();

      const newTitle = extractTitle(newContent);
      const newImageUrls = tweet.imageUrls.length > 0 ? JSON.stringify(tweet.imageUrls) : null;

      // Update the database: the title in schedules, the tweet details in schedule_x.
      await pool.query('UPDATE schedules SET title = ? WHERE id = ?', [newTitle, row.schedule_id]);
      await pool.query(
        'UPDATE schedule_x SET username = ?, content = ?, image_urls = ? WHERE schedule_id = ?',
        [fetchUsername, newContent, newImageUrls, row.schedule_id]
      );

      console.log(` -> title: ${newTitle.substring(0, 60)} | images: ${tweet.imageUrls.length}`);
      updated++;

      // Pause between requests to avoid overloading Nitter.
      await new Promise(r => setTimeout(r, 500));
    } catch (err) {
      console.error(` -> failed: ${err.message}`);
      failed++;
    }
  }

  console.log(`\nDone: ${updated} updated, ${failed} failed`);
  await pool.end();
}

main().catch(err => {
  console.error(err);
  process.exit(1);
});