feat(route): add southplus forum #22186
Successfully generated as following:
- http://localhost:1200/south-plus/forum/8 - Failed ❌
- http://localhost:1200/south-plus/forum/128 - Failed ❌
330ba2d to bdb5be8
- /south-plus/forum/:fid? for PHPWind-based forum
- Cookie-based auth via SOUTHPLUS_COOKIE env var
- Configurable UA via SOUTHPLUS_UA (cookie-version binding)
- Full-text fetching with cache.tryGet
- p-map concurrency control (3 at a time)
- Category extraction from thread listing
- Fid reference table and cookie guide in description
….trueUA as default
- Replace hardcoded Firefox 151 UA with config.trueUA to comply with guideline 36
- Use built-in proxy.dispatcher (PROXY_URI) for proxy support instead of custom env var
- Update documentation: F12 Network approach for cookie/UA extraction, PROXY_URI tip
- Fix markdown structure: close Cookie tip block before UA tip
Successfully generated as following:
- http://localhost:1200/south-plus/forum/8 - Failed ❌
…in CI
RSSHub globally overrides globalThis.fetch with proxy auto-injection via request-rewriter. Passing dispatcher: null (when PROXY_URI is unset) triggers undici internal assert(dispatcher) failure in GitHub Actions.
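One way to avoid ever handing an explicit null dispatcher to the request layer is to add the key only when a proxy dispatcher exists. The sketch below is illustrative only: `RequestOptions` and `buildRequestOptions` are hypothetical names, and `dispatcher` is typed loosely as `object` as a stand-in for an undici dispatcher rather than the type RSSHub actually uses.

```typescript
// Hypothetical option shape; `dispatcher` stands in for an undici dispatcher.
interface RequestOptions {
    headers: Record<string, string>;
    dispatcher?: object;
}

// Attach `dispatcher` only when one is available, so the resulting
// options object never contains `dispatcher: null`.
function buildRequestOptions(headers: Record<string, string>, dispatcher?: object | null): RequestOptions {
    const options: RequestOptions = { headers };
    if (dispatcher) {
        options.dispatcher = dispatcher;
    }
    return options;
}
```

With no proxy configured, the returned object has no `dispatcher` key at all, which leaves the globally patched fetch free to apply its own default.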
b4d3b7d to c046e10
Co-authored-by: Tony <TonyRL@users.noreply.github.com>
- Use ofetch instead of native fetch
- Simplify thread list selector per review suggestion
- Only update author/pubDate from detail page when content is found
- Move config block to correct alphabetical position
Auto Review: No clear rule violations found in the current diff.
Thanks for the thorough review and detailed suggestions! I've tried to address all the feedback.
The changes have been tested locally, and the route works as expected with proper cookie/UA configuration.
    async function handler(ctx) {
        const fid = ctx.req.param('fid') ?? '8';
        const cookie = config.southplus.cookie;
        const ua = config.southplus.ua || config.trueUA;
#22186 (comment) has not been resolved.
Does the site only work when using trueUA instead of RSSHub's default random generated UA?
> Does the site only work when using trueUA instead of RSSHub's default random generated UA?
No. The site works with any realistic browser UA. config.trueUA (RSSHub/1.0) is not required and in fact should NOT be used as a fallback, because it prevents the request-rewriter from generating a random browser UA.
Fix applied:
// Before:
const ua = config.southplus.ua || config.trueUA;
headers['User-Agent'] = ua;
// After:
const ua = config.southplus.ua;
if (ua) {
headers['User-Agent'] = ua;
}

When SOUTHPLUS_UA is not set, the User-Agent header is omitted, letting the request-rewriter generate a random browser UA automatically.
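The conditional above can be sketched as a small standalone helper. This is only an illustration: `buildHeaders` is a hypothetical name, and the actual route may assemble its headers differently.

```typescript
// Hypothetical helper: builds request headers from optional config values.
// A header is only set when its value is configured, so an unset UA leaves
// the User-Agent key out of the object entirely.
function buildHeaders(cookie?: string, ua?: string): Record<string, string> {
    const headers: Record<string, string> = {};
    if (cookie) {
        headers.Cookie = cookie;
    }
    if (ua) {
        headers['User-Agent'] = ua;
    }
    return headers;
}
```

Calling `buildHeaders('sid=abc')` yields an object with only a `Cookie` key, while `buildHeaders('sid=abc', 'Mozilla/5.0 ...')` also carries the configured UA.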
    const author = $row.find('a.bl[href*="action-show-uid"]').text().trim();

    // Thread post date in column 2 (div.f10.gray2)
    const postDateText = $row.find('div.f10.gray2').first().text().trim();

    // Last post date in column 4 (a.f10)
    const lastPostDateText = $row.find('td.tal.y-style a.f10').last().text().trim();

    // Use last post date as pubDate for RSS sorting
    const pubDate = parseDate(lastPostDateText) || parseDate(postDateText);

    // Thread category tag (e.g. [自购], [公告]) in column 1
    const category = $row.find('a.s8').first().text().trim();
Are you sure these selectors can match the content?
Yes. Here's the HTML structure and extraction results for each board.
Selector-to-HTML mapping
| Field | Selector | HTML example |
|---|---|---|
| Title | `$el.text().trim()` (`a[id^="a_ajax_"]`) | `<a id="a_ajax_2882258"><b>Title text</b></a>` |
| Category | `$row.find('a.s8').first().text().trim()` | `<a class="s8">[category tag]</a>` |
| Author | `$row.find('a.bl[href*="action-show-uid"]').text().trim()` | `<a class="bl" href="u.php?action-show-uid-428736">Author</a>` |
| pubDate | `$row.find('td.tal.y-style a.f10').last().text().trim()` | `<a class="f10">2026-06-08 23:27</a>` |
Test results across 4 boards (2026-06-08)
| Board | fid | Items | Categories extracted | Posts without category |
|---|---|---|---|---|
| Animation | 4 | 15 | [RAW], [3D动画], [MMD] | omitted |
| Comics | 5 | 15 | [合集], [汉化区补档], [日文] | omitted |
| ACG Discussion | 8 | 10 | [其它], [动漫], [cos] | omitted |
| ASMR / Voice | 128 | 15 | [asmr录播], [同人音声], [音声汉化] | omitted |
Raw HTML example (forum/128, post with category)
<tr align="center" class="tr3 t_one">
<!-- Column 0: status icon -->
<td>…</td>
<!-- Column 1: category + title -->
<td>
<a class="s8">[同人音声]</a> ← $row.find('a.s8')
<h3>
<a id="a_ajax_3373" href="read.php?tid-3373.html">
<b>[自购]title text</b>
</a>
</h3>
</td>
<!-- Column 2: author + post date -->
<td>
<a class="bl" href="...action-show-uid...">poster name</a> ← $row.find('a.bl[href*="action-show-uid"]')
<div class="f10 gray2">2026-06-07 10:55</div> ← $row.find('div.f10.gray2')
</td>
<!-- Column 3: replies/views -->
<td>…</td>
<!-- Column 4: last post date -->
<td class="tal y-style">
<a class="f10">2026-06-07 10:55</a> ← $row.find('td.tal.y-style a.f10').last()
</td>
</tr>

| # | Title | author | pubDate | category |
|---|---|---|---|---|
| 1 | [自购]title text | poster name | Sun, 07 Jun 2026 | [同人音声] |
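The pubDate shown in the row above is taken from the last-post cell, with the thread's own post date as a fallback. A minimal sketch of that fallback with explicit handling of empty or malformed text follows. `parseForumDate` and `pickPubDate` are hypothetical helpers standing in for the route's use of parseDate, and treating the timestamp as UTC is an assumption made only to keep the example deterministic.

```typescript
// Hypothetical parser for "YYYY-MM-DD HH:MM" text.
// Returns undefined (rather than an invalid Date) when the text does not match.
function parseForumDate(text: string): Date | undefined {
    const match = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2})$/.exec(text.trim());
    if (!match) {
        return undefined;
    }
    const [year, month, day, hour, minute] = match.slice(1).map(Number);
    // Assumption: interpret the timestamp as UTC; the site may use another time zone.
    return new Date(Date.UTC(year, month - 1, day, hour, minute));
}

// Prefer the last-post date; fall back to the thread's post date.
function pickPubDate(lastPostText: string, postText: string): Date | undefined {
    return parseForumDate(lastPostText) ?? parseForumDate(postText);
}
```

Returning `undefined` for unparseable text makes the fallback explicit, since `??` only moves on to the second value when the first is missing.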
Raw HTML example (forum/4, sticky post, no category)
<td style="text-align:left;line-height:23px;" id="td_3373">
<img src="...headtopic_3.gif" title="置顶帖标志"/>
[08-20]
<h3>
<a href="read.php?tid-3373.html" id="a_ajax_3373">
<b><font color=#FF00FF>新人报道帖子(回帖已修复)</font></b>
</a>
</h3>
</td>

→ No <a class="s8"> → category returns empty → omitted via category: category ? [category] : undefined
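The omission rule for posts without a category tag can be sketched in isolation as follows. `FeedItem` and `toItem` are hypothetical names used only for this illustration; the real route builds a richer item object.

```typescript
// Hypothetical, trimmed-down item shape for illustration.
interface FeedItem {
    title: string;
    category?: string[];
}

// An empty category string (no <a class="s8"> in the row) becomes undefined,
// so the field is left out instead of producing an empty tag.
function toItem(title: string, categoryText: string): FeedItem {
    const category = categoryText.trim();
    return {
        title,
        category: category ? [category] : undefined,
    };
}
```

For a sticky post, `toItem('Sticky thread', '')` has `category` set to `undefined`, while `toItem('Some thread', '[MMD]')` carries `['[MMD]']`.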
When SOUTHPLUS_UA is not configured, omit the User-Agent header entirely to let the request-rewriter generate a random browser UA. config.trueUA (RSSHub/1.0) is a crawler UA and may cause 403 errors with strict WAFs. Closes review comment DIYgod#7 in PR DIYgod#22186.
Involved Issue
Close #4541
Example for the Proposed Route(s)
New RSS Route Checklist
- p-map concurrency control (concurrency=3)
- parseDate parsing (YYYY-MM-DD HH:MM)
- Puppeteer
Note
Add an RSS route for South Plus (南+).
Features
- Authentication via the SOUTHPLUS_COOKIE environment variable
- cache.tryGet caching
- p-map concurrency control (3 requests at a time)
Files involved
- lib/config.ts: SOUTHPLUS_COOKIE config option
- lib/routes/south-plus/namespace.ts
- lib/routes/south-plus/forum.ts
Testing
Tested successfully in a Docker environment.
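The concurrency limit of 3 mentioned in the feature list can be illustrated without any dependency. The sketch below is a hand-rolled stand-in for p-map, not the library itself and not the route's actual code; `mapWithConcurrency` is a hypothetical name.

```typescript
// Hand-rolled stand-in for p-map: maps `items` through `mapper`
// with at most `concurrency` calls in flight, preserving input order.
async function mapWithConcurrency<T, R>(items: T[], mapper: (item: T) => Promise<R>, concurrency: number): Promise<R[]> {
    const results: R[] = new Array(items.length);
    let next = 0;

    // Each worker repeatedly claims the next unprocessed index until none remain.
    async function worker(): Promise<void> {
        while (next < items.length) {
            const index = next++;
            results[index] = await mapper(items[index]);
        }
    }

    const workerCount = Math.min(concurrency, items.length);
    await Promise.all(Array.from({ length: workerCount }, () => worker()));
    return results;
}
```

A route could call `mapWithConcurrency(threads, fetchDetail, 3)`, where `fetchDetail` is a hypothetical per-thread fetcher, to keep at most three detail-page requests running at once.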