Wikipedia:Bots/Noticeboard/Archive 13

User:RonBot trouble possibly in need of intervention

  Resolved – Actions have stopped, bot is unblocked. Categories can take a while to update, which is likely the root cause here. The file actually was deleted, so these actions were in scope - just stale. Local upload and protection, plus restoration and protection at commons have taken place to avoid issues with this specific placeholder image in the future. — xaosflux Talk 12:08, 20 March 2019 (UTC)

RonBot has been tagging a lot of pages with {{BrokenImage}} recently, with seemingly no broken images. I think this is due to the Commons file c:File:Blank.png being inadvertently deleted and promptly returned. It looks like it is used in a lot of infoboxes and the like. It looks like the bot is still tagging; I don't know why (maybe something not being updated instantly on our end?). Heaps of articles are now being tagged all the time. Could this require a shutdown? – Finnusertop (talkcontribs) 02:05, 20 March 2019 (UTC)

I've shut down the bot pending an investigation and I've notified the bot's owner. Any administrator is welcome to overturn the block and unblock the bot without my prior approval; if it should be unblocked, unblock it. Just let me know that you did so and what was found as far as the issue goes (if any was found). ~Oshwah~(talk) (contribs) 03:24, 20 March 2019 (UTC)
Moved from ANI now that the block is in place, so that the bot related issues can be addressed. — xaosflux Talk 03:29, 20 March 2019 (UTC)
Thank you for moving the discussion, Xaosflux. ~Oshwah~(talk) (contribs) 03:41, 20 March 2019 (UTC)
And thanks for the quick action @Oshwah:! I've asked for that image to be protected at commons while this is all figured out as well. — xaosflux Talk 03:42, 20 March 2019 (UTC)
Perfect; good call on the protection request. Hopefully this issue can be resolved quickly and without too much difficulty in modifying any code or process in order to fix it... :-) ~Oshwah~(talk) (contribs) 03:53, 20 March 2019 (UTC)
@Oshwah: Task 12 takes its data from Category:Articles with missing files - that NOW has only 67 entries. Running bot with supervision to ensure it removes the unwanted entries. Looks like it's removing 9 entries a minute - it will take a while to finish. I've not changed the code - if a page gets added to the category then it will add the banner, when not in the category it removes the banner. Bot runs every 12 hours. I assume the category is populated by the wiki software, as nothing is added to the pages to put it in the category. Ronhjones  (Talk) 04:27, 20 March 2019 (UTC)
Oh, interesting... Thanks for responding with the in-depth explanation... :-) ~Oshwah~(talk) (contribs) 04:40, 20 March 2019 (UTC)
Blank.png was deleted at 20:00, 19 March 2019 and restored at 22:20, 19 March 2019. No idea why category was still filled at 01:00 when task 12 starts. Sadly the Commons Delinker bot only waits 10 minutes after a deletion. I protected the image on commons as "Highly visible image", but it does not really stop deletions... Ronhjones  (Talk) 04:39, 20 March 2019 (UTC)
I've checked the log files - The category had 1452 entries when Task 12 started at 01:00. Ronhjones  (Talk) 04:47, 20 March 2019 (UTC)
Task 12 has removed the banner from 682 pages. Ronhjones  (Talk) 05:01, 20 March 2019 (UTC)
Maybe we should host such critical images locally, marking them as not to be moved to commons. Ronhjones  (Talk) 05:07, 20 March 2019 (UTC)
I've done so. — JJMC89(T·C) 05:39, 20 March 2019 (UTC)
@Ronhjones and JJMC89: good idea for sure, I've also had commons admins protect this to avoid possible issues with phab:T30299 allowing a commons override in certain cases. — xaosflux Talk 12:03, 20 March 2019 (UTC)
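
As an illustration of the Task 12 behaviour Ronhjones describes above (tag pages that are in the category, untag pages that are not), here is a minimal sketch. It is not RonBot's actual code: the function name `plan_banner_edits` and both inputs are hypothetical, and the real bot reads Category:Articles with missing files from the wiki rather than from in-memory lists.

```python
def plan_banner_edits(category_members, pages_with_banner):
    """Work out which pages need the banner added or removed.

    Hypothetical helper: both arguments are plain iterables of page titles.
    """
    members = set(category_members)
    tagged = set(pages_with_banner)
    to_tag = sorted(members - tagged)    # in the category, no banner yet
    to_untag = sorted(tagged - members)  # banner present, no longer in the category
    return to_tag, to_untag


to_tag, to_untag = plan_banner_edits(
    category_members=["Page A", "Page B"],
    pages_with_banner=["Page B", "Page C"],
)
print(to_tag)    # pages that would receive the banner
print(to_untag)  # pages that would have the banner removed
```

A design like this trusts the category completely, so if the category is stale when a run starts (as it apparently was at 01:00), the resulting edits are stale too, and the next run reverses them once the category has caught up.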

User:Filedelinkerbot

  Resolved – No further action needed on this one. — xaosflux Talk 12:01, 20 March 2019 (UTC)

The issue also affects User:Filedelinkerbot, which has been unlinking this file from articles and templates, causing all sorts of layout problems. – Finnusertop (talkcontribs) 04:33, 20 March 2019 (UTC)

@Finnusertop: I've rolled back 240 pages from which the delinker bot removed the image. Ronhjones  (Talk) 04:59, 20 March 2019 (UTC)
This appears to have stopped, would like @Krd: to verify though. — xaosflux Talk 11:55, 20 March 2019 (UTC)
Filedelinkerbot has no backlog currently, so there shouldn't arise any more issues related to this file. --Krd 11:59, 20 March 2019 (UTC)

Bot-like user scripts

I'm not sure if this is the correct place to ask, but do user scripts that make many edits, with limited intervention from a user, require a BRFA? I am asking because I have written a user script that bypasses the redirect created by a page move, if instructed. Once a user tells the script to make edits, there is no human intervention. WP:BOTSCRIPT states:

The majority of user scripts are intended to merely improve or personalize the existing MediaWiki interface, or to simplify access to commonly used functions for editors. Scripts of this kind do not normally require BAG approval.

However, this does not explain what to do for scripts that make edits with limited intervention. --Danski454 (talk) 14:10, 24 March 2019 (UTC)

@Danski454: the volume and impact of changes matter more than the mechanism of the change. As this would not be run from a bot account but from a regular editor account the primary concern would be if the edit should be made under a bot account to avoid being disruptive. What type of frequency and volumes would you expect to be making assisted edits? — xaosflux Talk 14:29, 24 March 2019 (UTC)
@Xaosflux: Running this from a bot account would hurt its usefulness as a script, at least for me. Regarding edit frequency and volumes, the script edits at a rate of 12 EPM, making an absolute maximum of 2,000 edits each time it is run, but it is unlikely that it would end up running that much; fewer than 100 edits each time is probably a closer estimate, with over 500 being very rare (as this requires many redirects, transclusions or links from templates). I would use this occasionally, mainly when moving a page away from an ambiguous title. --Danski454 (talk) 14:40, 24 March 2019 (UTC)
@Danski454: so the problem with throwing out 500 to 2000 edits is that you can flood watchlists and recent changes without the benefit of a bot flag. Think of this type of script use like people that use AWB. That being said having a "bot account" doesn't have to mean you need a server, advanced programming, etc - it can be as simple as having another logon that you load in another window to run the task. No one would bat an eye if you ran this on 25 edits for example, or if you ran it on 100 edits once every few months - it all becomes about volume and impact. A tangential issue to this is the general question if bypassing the types of redirects you would change in bulk (i.e. hundreds or thousands of updates) is something that is useful and strongly supported by most other editors; if it is then using a bot account also signals 'you don't need to worry about checking this' - if it isn't then it shouldn't be done at all. — xaosflux Talk 14:54, 24 March 2019 (UTC)
Considering this, I think AWB may be better suited for the task, as it allows review and is less disruptive. --Danski454 (talk) 16:09, 24 March 2019 (UTC)
Meanwhile, OneClickArchiver exists. —  HELLKNOWZ   ▎TALK 16:25, 24 March 2019 (UTC)
@Danski454: please keep in mind my note above that it is about the impact of actions, not the mechanism that most matters. Editors are welcome to make constructive edits using whatever method they want (web, api, AWB, scripts, etc) - but the same guidelines apply as to volume and types of changes. Likewise, making thousands of high frequency, repeated edits can be disruptive regardless of the tool - but running that tool under a bot flagged account can alleviate some of that concern. — xaosflux Talk 17:14, 24 March 2019 (UTC)
@Danski454: I run a number of tasks that are written as scripts but run through my bot. See User:DannyS712 bot/tasks tasks 3, 4, and 11 for approved tasks running via scripts. --DannyS712 (talk) 19:17, 24 March 2019 (UTC)
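
To put the volumes discussed in this thread in perspective, the sketch below converts an edit count into run time at the quoted rate of 12 edits per minute. The function name `run_minutes` is made up for illustration; the 12 EPM rate and the 25, 100, 500 and 2,000 edit counts are the figures mentioned above.

```python
def run_minutes(edit_count, edits_per_minute=12):
    """Minutes needed to make edit_count edits at a fixed rate."""
    return edit_count / edits_per_minute


# Edit counts mentioned in the discussion above.
for edits in (25, 100, 500, 2000):
    print(f"{edits} edits take about {run_minutes(edits):.1f} minutes")
```

At that rate a 2,000-edit run would occupy recent changes and watchlists for well over two hours, which is the volume-and-impact concern raised above.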

WP:URLREQ

I've created a Requests page for URL modifications related to link rot. Some bots/tools are generally approved for link rot work without going through BOTREQ for every domain change (currently WP:IABOT and WP:WAYBACKMEDIC). Obviously though any major scale change would need approval, for example modifying all of the NY Times links. URL changes are complex jobs requiring support for archive URLs (20+ archive providers, not just archive.org), various templates and their parameters (CS1|2, {{webarchive}}, {{dead link}} etc), real-time detection of 404 and redirect status, etc., and each request can have special conversion requirements.

The URLREQ page does not replace BOTREQ; most requests will probably still arrive through BOTREQ and elsewhere (talk pages, Village Pump etc), but it does help to keep these types of requests recorded on a single page so that the bot ops with the tools can better monitor scattered requests, not only on Enwiki but from other language wikis where the same URL changes would be applicable. Eventually a page like this on Meta for all projects might be created. -- GreenC 17:24, 27 March 2019 (UTC)

Breaking change for Wikidata descriptions

I don't know if anyone here will be affected by this, but there will be changes to Wikidata's database. If you know what this means, then this will affect you:

If you directly query the Labs database replicas for anything, then you need to update your code. The Wikidata development team will drop the wb_terms table from the database in favor of a new optimized schema on May 29th.

A new test system will be available on May 15th. You can read more on the mailing list. Whatamidoing (WMF) (talk) 03:05, 27 April 2019 (UTC)

Manually updating database reports

I have a question, because I'm a bit at a loss. I've recently been trying to create database reports that can help with wikignome tasks. Since I have absolutely no idea how to use toolforge for cron jobs, I've created the tasks as python code hosted on PAWS that I manually run each time to update the report. Below are the 3 tasks I've filed so far, and the result

So, what is the view of BAG in general about manually triggered database reports? I don't want it to be where the task is either speedy approved or denied based on who reviews it first. Thanks, --DannyS712 (talk) 05:01, 28 April 2019 (UTC)

I denied Wikipedia:Bots/Requests for approval/DannyS712 bot 32 because we came to the conclusion (mutually, I believe) that automation was needed, and moreover because there was a lack of prior discussion about creating this report. You did not tell me about your other database report bots. I have reservations about those as well. No other report appears to be run ad-hoc other than yours. As I said in the BRFA, I'm unaware of any strict rules, but I think we'd much prefer full automation for the official-looking WP:Database reports, which are meant to be reliably updated on a regular basis. If you were to go on a holiday to a tropical island, I doubt you'd want to be bothered with manually updating database reports :) Meanwhile consumers of these reports are quietly awaiting your return. This is why we have bots.
I don't know how PAWS works but if you can make it set up a cron for those tasks, then that solves this issue. You have the Python code, so you're almost there... If you want I can give a quick run through of the steps you may need to get your bots on Toolforge. This would include some external learning resources. In the end, I think you'd be doing yourself a favour by letting your bots run independently of your sleep/holiday/real-life schedule :) MusikAnimal talk 05:36, 28 April 2019 (UTC)
@MusikAnimal: I've looked at toolforge, but I don't have the time to learn command-line syntax, etc. right now. Unfortunately, the conclusion was not mutual - I was agreeing that having it run automatically would be nice, but given that I am active almost every day I don't mind clicking a few buttons every few days to update the report. I agree that, in the end, I would be doing myself a favour, but in the middle (for now) I'd prefer a manual task over nothing. As for the official-looking database reports page, I note that many only run weekly, and 5 run even less frequently (4 of them only run once per month). Looking at the talk page, it seems that there isn't an official structure, but rather that it serves as a collection of individually maintained and updated (by bot) pages. --DannyS712 (talk) 05:43, 28 April 2019 (UTC)
Right, but those slower intervals are because they don't need to be updated more often, or that the queries they use take a long time to finish, etc. I see now there is at least one other editor who apparently is not using automation. So it would seem there is a larger discussion in store, one where I might possibly be in the minority.
Overall, let me make it clear there are no inherent rules being broken, as far as I can tell. I did deny your BRFA under the false assumption that you were in agreement. For that, I apologize. But I do think the notion of manual WP:Database reports needs broader discussion. Maybe MZMcBride has an opinion?

As for your bot tasks, DannyS712: If you could (a) put your code on GitHub (public repo), (b) give me access to your Toolforge tool account, then (c) I can probably take care of the rest, showing you everything I did.

Above all else, it's much preferred to seek support for a new database report at Wikipedia talk:Database reports (though I admit the orphan/links report is surely useful for someone other than just yourself :)

Thanks for starting this discussion, as it is clearly needed. I was unaware others had approved non-automatic WP:Database reports. MusikAnimal talk 06:09, 28 April 2019 (UTC)

@MusikAnimal: I'll try to figure out how to do step b, and once I do I'll let you know. Thanks for the offer to help! However, now that I've cleared up the miscommunication with task 32, would you be willing to reconsider your decision? --DannyS712 (talk) 06:16, 28 April 2019 (UTC)
@DannyS712: Well, for starters, I still haven't seen anyone show support for it... That's probably easy to get. But at any rate you have most things scratched off of the Toolforge list. I don't see an urgency to write to WP:Database reports when we can do this the proper way, and with a proper BRFA to go with it. The bot userspace is always an option too, at least in the meantime! :) MusikAnimal talk 06:31, 28 April 2019 (UTC)
@MusikAnimal: I first posted 2 weeks ago at Wikipedia talk:Database reports#New reports, and since then there has been some support and no opposition. Until I figure out toolforge (thanks for the help with that) can you take a look and reconsider? --DannyS712 (talk) 18:18, 13 May 2019 (UTC)
  • @DannyS712: so as far as these go, once you publish something to WP:DBR - what ends up happening is other people rely on it, and rely on it being current. This is not along the lines of normal edits or bot tasks where we say that no one should even count on a future edit. Is it "right" - not sure, but it is what it is. You certainly should feel free to make all the reports you want, unless they are going to be "popular" and regularly maintained - putting them in your bot's userspace and just linking to a userspace index under the "Other reports" section on DBR may be best. For such reports, unless you are going to be updating them at some very high volume, you don't need BRFAs either. Is that guidance helpful? — xaosflux Talk 16:08, 28 April 2019 (UTC)
    Yes. I think I'll take MA up on their offer to help with toolforge - once it updates automatically, it'll stay current. --DannyS712 (talk) 19:04, 28 April 2019 (UTC)

DannyS712, I wrote this for Kadane; maybe it is of use. If you need any help, let me know. --- GreenC 16:33, 3 May 2019 (UTC)

Permanent link, in case the page gets archived. eπi (talk | contribs) 05:00, 5 May 2019 (UTC)

In my experience, nobody really cares how a database report is being updated. For what it's worth, when I initially wrote these reports, I used this account ("MZMcBride") and there wasn't automation. I'm also not sure it needs to matter to a volunteer if users rely on a particular report. It's unreasonable to expect that a database report author needs to maintain the report, particularly when database report users will very often want the report updated indefinitely. That's a very long commitment! --MZMcBride (talk) 05:08, 6 May 2019 (UTC)

You can check out any time you like, but you can never leave... –xenotalk 18:26, 13 May 2019 (UTC)

Inactive bots - May 2019

Per the bot policy activity requirements, the following bots will be deauthorized and deflagged in one week. These bots have not made an edit in 2 or more years, nor has their operator made an edit in 2 or more years.

BOT_user_name         BOT_last_edit   Oper_username         Oper_lastedit  Notes
User:StatisticianBot  20161104065821  User:Dvandersluis     20161019
User:DYKReviewBot     20161030205038  User:Intelligentsium  20161030
User:DefconBot        20160902053125  User:A930913          20160403
User:Mr.Z-bot         20160830220939  User:Mr.Z-man         20160821
User:BracketBot       20160719215737  User:A930913          20160403
User:DrTrigonBot      20150617013726  User:DrTrigon         20160626
User:DixonDBot        20130329214425  User:DixonD           20170204
User:MGA73bot         20130202213645  User:MGA73            20160925
User:Lucia Bot        20121116225341  User:Beria            20170501
User:Ryan Vesey Bot   20120928012455  User:Ryan Vesey       20170308

Should an operator wish to maintain their bot's status, please place "keep" and your signature in the notes column above for your bot. Deauthorized bots will need to file a new BRFA should they wish to become reactivated in the future. Thank you, — xaosflux Talk 03:27, 14 May 2019 (UTC)

Required user talk notices left. — xaosflux Talk 03:33, 14 May 2019 (UTC)
Discuss
With no comments by the operators, the above accounts have had their bot flag removed. Primefac (talk) 20:22, 23 May 2019 (UTC)
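
The two-year criterion applied to the table above can be checked mechanically. The sketch below is only an illustration, not the query actually used to build the list: the helper names are hypothetical, and it assumes the table's timestamps are `YYYYMMDDHHMMSS` for bots and `YYYYMMDD` for operators, as they appear above.

```python
from datetime import datetime


def parse_stamp(stamp):
    """Parse a 14-digit (YYYYMMDDHHMMSS) or 8-digit (YYYYMMDD) timestamp."""
    fmt = "%Y%m%d%H%M%S" if len(stamp) == 14 else "%Y%m%d"
    return datetime.strptime(stamp, fmt)


def is_inactive(bot_last_edit, oper_last_edit, reference, years=2):
    """True if neither the bot nor its operator has edited within `years`.

    Assumes `reference` is not 29 February; a real tool would handle that case.
    """
    cutoff = reference.replace(year=reference.year - years)
    return (parse_stamp(bot_last_edit) <= cutoff
            and parse_stamp(oper_last_edit) <= cutoff)


# Date of the notice above, used as the reference point.
notice_date = datetime(2019, 5, 14)

# User:Lucia Bot / User:Beria, the row with the most recent operator edit.
print(is_inactive("20121116225341", "20170501", notice_date))
```

Both conditions must hold, so a bot whose operator is still editing would not be listed even if the bot itself has been idle for years.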

Naming conventions

Okay, so before I go ahead and say something that either ends up causing more confusion, and/or making it seem like I'm the absolute authority in this case...

Background: I noticed at this BRFA that there was a question about keeping the absolute numbering of bot tasks even though there are three different bots being numbered (i.e. "Task 13" is run as bot II's first task), and the above Task 38 might be run by bot III (even though it's the first task run by that bot).

I suppose my question is, should we allow this sort of "absolute" numbering between different bots run by the same operator? Does it matter? As an arbitrary example, the following could be a set of task requests:

  • GenericBot
  • GenericBot 2
  • GenericBot 3
  • GenericBot II 4
  • GenericBot 5
  • GenericBot II 6

I genuinely don't have a position on this (and said so in this discussion), but as mentioned above I'd like to not give advice that's either contrary to what should be done and/or what people expect. Primefac (talk) 20:20, 23 May 2019 (UTC)

In my experience, each bot gets its own list of tasks. See User:BattyBot, for example. In the case that you linked, a single operator is proposing to run multiple bots (you might call these bots "siblings"). In that case, in order to preserve everyone's sanity, I would love to see a single User page for the set of bots, and a single list of tasks, even if different tasks are performed by different Bot-siblings. Just include a column in the table that shows which bot performs each task. In short, I think it would be the most sane option to do it as you have shown in the list above. – Jonesey95 (talk) 20:55, 23 May 2019 (UTC)
There's a very small amount of past precedent: AnomieBOT, MusikBot, ClueBot and SoxBot restart their numbering when they go to "II", but Cyberbot doesn't. I suppose "It doesn't really matter" is a valid outcome of this discussion. Primefac (talk) 21:06, 23 May 2019 (UTC)
@Jonesey95: Does User:DannyS712 bot/tasks meet your needs? It's transcluded by all of my bots --DannyS712 (talk) 21:08, 23 May 2019 (UTC)
Yes, nicely. Sorry for not even looking before posting the above. Please check to ensure that the bot user name is correct for each task. I think task 38 might need adjustment, or at least a tentative mark of some sort, given the BRFA discussion. – Jonesey95 (talk) 21:12, 23 May 2019 (UTC)
@Jonesey95: yes, the default is DannyS712 bot, and once any bot goes to trial I update it if that isn't right. I was going to use III for 40, but since that stalled I'll use it for 38. --DannyS712 (talk) 21:17, 23 May 2019 (UTC)
  • Case by case is the way to go here. First, it is OK to "skip" task numbers (especially as some may not get approved) so having Bot, Bot 2, BotB 3, Bot 4, BotB 5 isn't a problem just because some were "skipped". In general, it is a good idea to have the BRFA subpage name match the bot's actual name - over very long periods of time we've seen bots be renamed, bots change operators, etc. None of this is a "big deal". Within an actual BRFA, the actual bot name of whatever it is should always be used. If someone has a suite of bot accounts, using a centralized page name (that is redirected from the other accounts) is also OK. — xaosflux Talk 00:23, 24 May 2019 (UTC)

SineBot inactive since 2019-04-30

SineBot's latest contribution is from 2019-04-30T06:38:33. See User talk:Slakr#SineBot down.

@Masumrezarock100: I'd say the issue is not urgent enough to require other users' assistance. As the bot's code sadly is not available to the public, other users cannot help by running the bot in the meantime.

~ ToBeFree (talk) 16:44, 26 May 2019 (UTC)

@ToBeFree: Is there any alternate bot to sign comments? It's a pain now that those unsigned comments are not automatically signed. Teahouse suffers the most. Masum Reza📞 16:49, 26 May 2019 (UTC)
{{Xsign}} makes it relatively easy compared to other templates, but I agree that SineBot is extremely useful. I miss it dearly. ~ ToBeFree (talk) 17:09, 26 May 2019 (UTC)
User:Anomie/unsignedhelper.js is great for unsigned comments. Primefac (talk) 12:40, 27 May 2019 (UTC)
  Resolved – SineBot is running again :) ~ ToBeFree (talk) 01:56, 29 May 2019 (UTC)

Bot unblocks

Hi there, we currently have two blocked bots with pending unblock requests. To a lay admin, it's unclear whether these bots can be unblocked outright at this point, or if we should get the nod from BAG first, or at least from a bot-experienced admin. Can someone who is more qualified handle these?

Thanks in advance, ~Swarm~ {sting} 19:21, 1 May 2019 (UTC)

Often for blocks of bots any admin can feel free to unblock when they believe the original problem won't reoccur when the block is lifted, see Wikipedia:Blocking policy#Blocks in temporary circumstances (second bullet). This can be as simple as the operator stating that they've disabled the offending code. On the other hand, if it was blocked as an unapproved bot there's usually no reason to unblock until BAG approves a trial as there's usually nothing the bot account is allowed to do that is prevented by the block.

BTW, if the operator is an admin, that may even implicitly allow the operator to do it themselves. Exactly how clear it has to be before that applies is arguable; safest is to only do so if the blocking admin explicitly said something like "feel free to unblock when that's fixed". Anomie 21:06, 1 May 2019 (UTC)

As far as Citation bot goes, it appears the block was for making inappropriate edits (that would have been inappropriate if made by a human editor as well) - assuming this is the situation, it needs to be resolved first. It appears there is still active discussion occurring about that point? — xaosflux Talk 22:08, 1 May 2019 (UTC)
You are incorrect; the complaint was that a very few links to copyright-infringing CiteSeerX references were added. That is long since fixed. The second complaint is that the bot does edits on its own without a human being getting credit. AManWithNoPlan (talk) 22:13, 1 May 2019 (UTC)
It is a minor complaint that doesn't cause any harm to the encyclopedia, but it is an annoyance. IMO that shouldn't be enough to maintain the block, but I'm pretty involved in the discussion, so I'm not keen to give the thumbs up myself. Someone else could though. Headbomb {t · c · p · b} 23:53, 1 May 2019 (UTC)
@AManWithNoPlan: can you elaborate? That bot is operated by Smith609, who is responsible for any edit made under that account. Is there a complaint that it is making edits outside of its approval, or that the approval needs to be revisited? — xaosflux Talk 00:20, 2 May 2019 (UTC)
People are able to request that the bot edit a specific page, but the bot does not force users to reveal their identities. AManWithNoPlan (talk) 00:27, 2 May 2019 (UTC)
OK? Is there a specific BRFA task # where you expect this to be occurring? It looks like it is using an ancient (in wiki time) passed-on BRFA (Wikipedia:Bots/Requests for approval/DOI bot) - which doesn't seem to require that. As long as Smith609 is taking responsibility for the edits, this doesn't seem to be a specific violation of anything, is there more to this? — xaosflux Talk 00:41, 2 May 2019 (UTC)

The bot has been approved many different times as features were added. Always approved to run as a pure bot. AManWithNoPlan (talk) 02:42, 2 May 2019 (UTC)

@AManWithNoPlan: OK - so can you point to what task you think is malfunctioning? — xaosflux Talk 03:26, 2 May 2019 (UTC)
I take it that part of the contention is that some people want a feature request added - @Headbomb: I think you are one of the requesters of this? I'm not seeing how this is a showstopper. If Smith609 is allowing his bot to make "bad edits" then, sure it should be stopped (and putting a name of who they made the bad edit on behalf of isn't really fixing that core problem is it?). Smith609 could work on fixing that lots of ways, including by only allowing certain users to trigger the bot. Are there some examples of bad edits that the bot is making for reference here? — xaosflux Talk 03:49, 2 May 2019 (UTC)
@Xaosflux: I am indeed one of those that requested that. The reasons mostly being that when it was adding CiteSeerX links, this would have been useful to WP:TROUT users that didn't review the bot's edits. This would still be useful for trouting people that don't review the bot's edits, or to help users that try to use the bot, but there is no outstanding/egregiously problematic behaviour in terms of actual edits (at least as far as I can tell). Headbomb {t · c · p · b} 03:56, 2 May 2019 (UTC)
I have several concerns about this bot relative to BOTPOL. First that its operator of record appears to be an absentee landlord: before the bot was blocked they had not edited any page related to the bot for several months (and ditto before that). There was a recent, contentious, RfC about the bot's behaviour in which they were pinged multiple times, but did not participate at all (despite editing elsewhere on the project). I have several times asked them (on their talk page and on the bot's talk page) to confirm whether they are in fact still the bot's operator, and to address WP:BOTACC, second para, and WP:BOTCOMM. There has been no response, beyond removing the question with the edit summary Archive aggressive comments. During the RFC the bot's proponents argued that since the bot was user activated it was not the bot that was responsible for the problematic edits (but it didn't identify the user activating the bot). Once the bot was blocked for, among other things, not identifying the activating user, the bot's operator suddenly chimed in claiming it was, in fact, the operator that was responsible for the edits (see also this thread at WT:BOTPOL). The net result seems to be that nobody takes responsibility for the edits.
Second that this bot relies on an 11-year-old BRFA for a task to "Adds DOIs to citations provided using {{cite journal}}". But the bot has changed extensively in the decade since that BRFA and its maintainers now appear to operate on the assumption that it has de facto approval to do "anything at all that is related to citations" with no need for a new BRFA (in the above RFC they were asked several times for the BRFA that authorized mass-removal of valid citation template parameters, with the only response a suggestion that it was "grandfathered in" due to the bot's age, and besides, the bot's maintainers didn't think anyone ought to use those parameters in any case, never mind CITEVAR). There are several concerns surrounding responsiveness to the community and only making uncontroversial edits. I think the bot's various tasks need to be analysed (documented), and which ones have actual authorisation sorted. Existing BRFAs: 1, 2, 3, 4, 5, 6, 7, 8, 9 (last one in 2011).
As for the two direct reasons the bot was blocked… As I understand it the addition of the problematic links has been stopped and this reason for the block resolved. In fairness it should also be noted that just how problematic these links are (or are not) and how our various policies apply to that issue is not clear cut, and the bot's de facto maintainer has taken steps to get that question resolved through community discussion at Wikipedia talk:Copyrights#CiteSeerX copyrights and linking with no real conclusion (ie. the community has failed to provide the necessary guidance).
The second reason is the issue of not attributing edits to the user responsible for them, which has not been resolved. The bot now has some mitigating functionality that, when I looked at it, at least included a form field to voluntarily provide a user name that the bot would then insert into the edit summary (see https://tools.wmflabs.org/citations/). I've not paid close enough attention to tell whether this is the sum of the changes made to address this issue (I would hope the operator would answer that question). Usually this might have been "good enough" for practical purposes, but given the problems outlined above, the bot's mode of operation, and the potential for abuse (for example, during the RFC there were—alleged, not substantiated—claims that there was a dramatic uptick in removing the citation parameters in question, in an apparent effort to get them removed before the community could prohibit that behaviour: i.e. anonymous mass edit-warring using the bot), my conclusion is that in order to actually meet the requirement for attributing edits to the user making them, the bot must implement some form of actual authentication. It doesn't absolutely have to be OAuth, but it needs to be something that achieves the same effect. It should probably also not allow blocked or non-logged-in users to use the bot. But to be absolutely clear: I am here talking about potential for abuse, not any actual ongoing abuse (the bot is currently blocked, after all), and the requirements of BOTPOL.
PS. Apologies for the wall of text. I know some people dislike that, but if I could have written this shorter I would have. --Xover (talk) 14:29, 2 May 2019 (UTC)
Sometimes it takes a lot of text. AManWithNoPlan (talk) 14:39, 2 May 2019 (UTC)
Thank you for pointing out that the general community response to my attempts at various times to start discussions has resulted in less than the sound of crickets. The whole CiteSeerX issue was never really resolved other than the bot stopping the addition since it is not a high value feature anyway. AManWithNoPlan (talk) 14:48, 2 May 2019 (UTC)
"There has been no response, beyond removing the question with the edit summary Archive aggressive comments". Sure, if you ignore the response, and the patently obvious evidence that Smith609 operates the bot. Headbomb {t · c · p · b} 15:35, 2 May 2019 (UTC)
Easy to miss a response when it's ten days later. Fewer such assumptions might not go amiss. ——SerialNumber54129 15:40, 2 May 2019 (UTC)
You appear to be confused. They have indeed edited, sporadically, in the interim. As I wrote above. However the first diff you provide (which message was one of the ones included in the diff I provided) is a response to the blocking admin, not to my query, and addressed a completely unrelated issue. My query was a reply to that message and they have not provided any kind of response to that, much less actually addressed the query. The third party presuming to answer for the bot's operator (your second diff, message also included in my original diff) may well find the answer blindingly obvious, but I am not possessed of such powers of mind reading. By pure happenstance I am perfectly capable of interpreting various logs at Github, but that does not seem a reasonable requirement for resolving a question addressed to a bot's operator. Their ability to periodically click the one button it takes to merge a pull request on Github is also completely orthogonal to the question asked: do they, in fact, consider themselves the bot's operator—with the attendant responsibilities set out in BOTPOL—and can they address WP:BOTACC, second para, and WP:BOTCOMM. I do not feel that this is an unreasonable question to pose to a bot operator, and by implication of your argument it should not be a hard one to answer. And yet I have now literally waited months without an answer (the irony...). --Xover (talk) 16:34, 2 May 2019 (UTC)
Citation bot (talk · contribs) is operated by Smith609 (talk · contribs), as is made evidently clear by the prominent {{bot}} template featured on its user page, as required per WP:BOTACCOUNT. Headbomb {t · c · p · b} 17:14, 2 May 2019 (UTC)
That's a start. Let's get it to adhere to the rest of BOTPOL shall we. ——SerialNumber54129 17:35, 2 May 2019 (UTC)
What parts of BOTPOL do you feel the bot isn't adhering to? Headbomb {t · c · p · b} 17:37, 2 May 2019 (UTC)
All the more remarkable then, that after I have asked three times, over three months, they have still not managed to affirm this, much less respond to the rest of my question. In fact, you may feel free to consider the repetition here the fourth time, in the fourth month, that I have asked the question. --Xover (talk) 18:01, 2 May 2019 (UTC)
That you refuse to hear the answer to your question is your problem. Headbomb {t · c · p · b} 20:22, 2 May 2019 (UTC)
You may want to take a look in the mirror when it comes to refusing to get the point. --Xover (talk) 06:14, 3 May 2019 (UTC)
I could be wrong about this but my understanding was that user-initiated tools are not subject to bot policy. This came up when a well-known user-initiated tool by a well known and respected bot op was causing problems, and the operator was not responding to fix requests. A BAG member told me that it was outside the responsibility of the BAG group, they could not block it. Since this is similar to what happened with Citation bot, I'm thinking I was given bad information. Are Tools that edit Wikipedia (on behalf of a user and triggered through a web interface) subject to bot policy? -- GreenC 20:52, 2 May 2019 (UTC)
For tools like WP:TWINKLE or WP:AWB, the answer is usually no, unless there are WP:MEATBOT concerns. Headbomb {t · c · p · b} 21:05, 2 May 2019 (UTC)
I mean tools like Citation bot and others like it, of which there are many, where the user initiates through a web interface and the bot edits on their behalf and the user (not the bot operator) is responsible for the edit -- this being the key difference from classic bots. -- GreenC 23:05, 2 May 2019 (UTC)

I am going to throw this out there. The bot is approved to edit any pages it wants to (as long as it does not have a no robots tag). In fact historically it did this and even did certain categories without being asked. So, the fact that we prioritize pages a human wants us to look at is a reduction in our authorized activities. I realize this is a highly technical interpretation of the rules, but some people want to follow the letter of the law, here you go. AManWithNoPlan (talk) 13:52, 3 May 2019 (UTC)

A quick reminder that I (like most people involved) actually have a day job, family, other volunteering, etc. Please everyone remember this during discussions, etc.. AManWithNoPlan (talk) 02:23, 4 May 2019 (UTC)


Please see this: https://en.wikipedia.org/wiki/User_talk:Citation_bot#The_current_block_is_not_well-founded_on_the_policy AManWithNoPlan (talk) 17:03, 29 May 2019 (UTC)

Issue with a bot, not finding the bot maintainer's response satisfactory, not sure what to do now.

See Wikipedia talk:Username policy#Utility of reports by DatBot. This bot reports users at UAA for being possible sock puppets. 99% of the time they are not blatant violations of the username policy, which is absolutely the only thing UAA is for. I was under the impression that bot tasks all had to be approved so I'm curious as to whether this specific task was approved and if so, why?

When questioned about it the reply from the bot's maintainer was along the lines of "I think I might remember why I did this and since many of the users wind up blocked it's clearly working."[1] I think this ignores several pertinent facts:

  • There is no evidence that the UAA reports are in any way what leads to eventual blocks, let alone global locks, of these accounts
  • If you don't seem to know why you coded the bot to do certain things, when those things are challenged it seems reasonable to change the bot's behavior
  • It's a blatant misuse of UAA to report socks there; it is in no way a forum for anything other than blatant violations of the username policy

I would therefore appreciate input in the discussion at WT:UPOL from BAG members about how to proceed here. I don't have anything against DatGuy and I'm sure his bots do lots of helpful things, but this particular thing does not seem helpful and just creates noise in an administrative area that regularly experiences backlogs. Beeblebrox (talk) 19:41, 30 May 2019 (UTC)

@Beeblebrox: can you place a couple of diffs here for edits you think this bot is making that you don't think have an approved BRFA? Then the operator can be called to identify if there is a BRFA task approved for them or not. If there is, but things have changed over time you can ask for a re-evaluation of the BRFA here as well. As this is all "back page" type stuff, I don't suggest blocking while this is sorted out. — xaosflux Talk 20:08, 30 May 2019 (UTC)
All the reports are like this: [2]. The creation of an account trips an edit filter and the bot reports it to UAA. Typically these accounts have no edits, which is something we normally discourage reporting at UAA unless it is something like hate speech or attacks on specific people, all the information it provides is "possible sockpuppet creation" which isn't in any way a username violation. Beeblebrox (talk) 20:37, 30 May 2019 (UTC)
As it happens, it appears that DeltaQuadBot is not making reports right now, so for the last few days all the action at WP:UAA/B has been this bot making reports and admins removing them without blocking because literally none of them are based in any way on the username policy. Beeblebrox (talk) 21:21, 30 May 2019 (UTC)
For the record, the relevant BRFA is this one, as a takeover of this one. Headbomb {t · c · p · b} 21:30, 30 May 2019 (UTC)
@Headbomb: is my reading of that that these UAA reports should only be getting triggered from filter 579? — xaosflux Talk 22:00, 30 May 2019 (UTC)
And if so, @Beeblebrox: can you think of a better place for "Possible sockpuppet account creations" to go (or do you think this is useless)? — xaosflux Talk 22:02, 30 May 2019 (UTC)
Without context I don't know how anyone can evaluate these reports other than the maintainers of that specific edit filter. It doesn't mention whose sockpuppet the reported user might be, and it reports them before they've actually done anything so there's nothing to go on without running a (completely unjustifiable) checkuser on them. If there was some way to post these reports to pertinent pages at SPI so that people who actually knew what the context was could review them that would be something, but they serve no purpose at all at UAA. Looking at that BRFA, it seems it was going to provide context by linking to relevant SPI pages, but it doesn't do that in practice. But even if it did, this just doesn't belong at UAA, it belongs at SPI if it belongs anywhere. Beeblebrox (talk) 23:26, 30 May 2019 (UTC)
  • OK, so it seems this is "easy" to turn off (at User:DatBot/filters) - @DatGuy: looks like there is pushback on this one report job, do you have any issues with it being removed? (Perhaps one day a better filter can be made to feed that task). — xaosflux Talk 23:44, 30 May 2019 (UTC)
    • Also, if anyone thinks these are useful, perhaps they can go somewhere else? (You don't need a new BRFA to just change the target page if it is something easy, esp if it would be something like DatBot/PossibleSocks or something). — xaosflux Talk 23:46, 30 May 2019 (UTC)

The bot is still making these reports. Bot op has been intermittently active but has not responded here. I don't want to see this bot blocked, it does do a lot of good work against vandals, but this is just not acceptable. Beeblebrox (talk) 14:29, 16 June 2019 (UTC)

@Beeblebrox: I disabled that function by commenting out "579" from the line at User:DatBot/filters, at least pending engagement by the operator here for further discussion. I'm not sure how long it will take to go into effect, but give it at least a day to see if it helps? — xaosflux Talk 15:45, 16 June 2019 (UTC)
Appreciate that, thanks. Hopefully that's the end of it. Beeblebrox (talk) 16:05, 16 June 2019 (UTC)

New archive box

Thanks to Primefac for getting the ball rolling on this with the creation of Wikipedia:Bots/ArchiveBox. I spiffied it up a bit, and deployed it on all bot-related venues we have.

The functionalities are what you'd expect. The relevant section of the box automatically opens up, providing you with a search box specific to the venue you are at, with a general search box covering everything bot-related. Specifically, any pages that start with Wikipedia:Bot or Wikipedia talk:Bot, including things like Wikipedia:Bots/Requests for approval/Bibcode Bot.

Suggestions for improvements and general feedback welcomed, of course. Headbomb {t · c · p · b} 19:22, 23 June 2019 (UTC)

Looks nice! — xaosflux Talk 19:53, 23 June 2019 (UTC)

Remove bot flag?

User:Italic title bot has no edits since 2013. -- Magioladitis (talk) 17:11, 2 July 2019 (UTC)

@Magioladitis: (see also prior section) the current bot policy only forces removal if both the bot and operator are inactive. However if @~riley: isn't going to operate this anymore and asks, we certainly can mark it retired! — xaosflux Talk 17:13, 2 July 2019 (UTC)

Wikipedia:Bots/Status

Soooo... anyone feel like working on redoing Wikipedia:Bots/Status? I think it would be handy to have an index of all bots, but it will be a lot of work. — xaosflux Talk 01:15, 2 July 2019 (UTC)

It might be the perfect job for a bot. Someone should make a bot request. --Izno (talk) 01:58, 2 July 2019 (UTC)
Izno, someone should file a bot request to make bot requests, one to handle BRFAs, and another bot to write the bots. —CYBERPOWER (Around) 02:39, 2 July 2019 (UTC)
:) I mean, we already do have a bot which takes care of the WP:Bot requests table of contents, and this feels like a similar kind of request; a bot could keep track of recent changes for editing bots or something similar. An alternative implementation might be to request that bot ops, when they file their BRFA, make a JSON representation or something of the tasks their bot is executing, which a bot could keep track of. --Izno (talk) 02:49, 2 July 2019 (UTC)
Could have a standardized template that can be placed on bot user pages to indicate tasks. Galobtter (pingó mió) 08:33, 2 July 2019 (UTC)
I went with Json as that's machine-readable, but that's another alternative. --Izno (talk) 13:14, 2 July 2019 (UTC)
@Galobtter: I've seen other projects do that, and it normally works pretty well as long as some exceptions are allowed for very complex bots with lots and lots of tasks - I don't think that solves a central database (table?) of bot tasks though. Such a location could include every task summary and the status of each task (proposed/approved/completed/unapproved). — xaosflux Talk 13:31, 2 July 2019 (UTC)
We now have Category:Active Wikipedia bots. At some point we use categories to better work with these things. -- Magioladitis (talk) 13:41, 2 July 2019 (UTC)
Doesn't help for the "does someone have a bot that does x" or looking for what other bots might this new request conflict with type searches though. — xaosflux Talk 14:39, 2 July 2019 (UTC)
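As a rough illustration of the machine-readable task record floated above, here is a minimal Python sketch. The field names, the `BotTask` class, and the example bot "ExampleBot" are all hypothetical; only the four status values (proposed/approved/completed/unapproved) come from the discussion. It is not an agreed format, just one way such records could be serialized to JSON and searched.

```python
import json
from dataclasses import dataclass, asdict

# Status values taken from the discussion; everything else here is an assumption.
STATUSES = ("proposed", "approved", "completed", "unapproved")


@dataclass
class BotTask:
    bot: str           # bot account name (hypothetical field name)
    task_number: int   # task number as used in the BRFA
    summary: str       # one-line description of what the task does
    status: str        # one of STATUSES

    def __post_init__(self):
        # Reject statuses outside the small agreed vocabulary.
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")


def to_json(tasks):
    """Serialize a list of BotTask records to a JSON string."""
    return json.dumps([asdict(t) for t in tasks], indent=2)


def tasks_with_status(tasks, status):
    """Return the tasks whose status matches, e.g. to find running work."""
    return [t for t in tasks if t.status == status]


if __name__ == "__main__":
    tasks = [
        BotTask("ExampleBot", 2, "Disable categories on draft pages", "approved"),
        BotTask("ExampleBot", 30, "Fix deprecated template parameters", "completed"),
    ]
    print(to_json(tasks))
```

A central table could then be generated from such records, which would also support the "does some bot already do x" searches mentioned above.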

If we were to create a table of which bots do what, the category of approved BRFAs would be a good place to start - just create the first column based on that cat. Then, go through and mark whether it was a one-time-run or continuous. The latter group can then be further categorized/described. So yes, a lot of work. Primefac (talk) 16:03, 2 July 2019 (UTC)

I made a start on something similar a while ago, taking all of the approved BRFAs and sorting them by bot. I should be able to post something within a few days.— DannyS712 (talk) 16:06, 2 July 2019 (UTC)

I can certainly clean up the list. How many years of no editing is considered as "inactive"? I recall we had a rule on when to remove bot flag. -- Magioladitis (talk) 17:03, 2 July 2019 (UTC)

I think this is more for listing the tasks being performed by active bots (not necessarily listing just the bots). For example, Task 30 by my bot for dealing with deprecated/broken parameters in templates, or Task 2 which disables cats on draft pages. Primefac (talk) 17:06, 2 July 2019 (UTC)
User:DannyS712/sandbox10 is a list of all approved BRFAs, as of 14 May 2019. I'm organizing it by bot. Once that is done, separating the "active" bots from the "inactive" bots results in 2 lists: a list of tasks that are completed or still running, and a list of tasks that either were completed or have standing approval but are no longer being actively done by current bots. --DannyS712 (talk) 17:09, 2 July 2019 (UTC)
To get to the point where they need a new BRFA it is very lenient and we usually batch process them twice a year. The current policy requires both the bot AND the operator (would be nice to have these on a table :D ) to be 100% inactive for 2 years to force retire a bot. Though if a bot only had one-off-tasks and they were all completed it would be feasible to mark them inactive and deflag them as well. — xaosflux Talk 17:11, 2 July 2019 (UTC)
Xaosflux do you recall when was the last check for inactive bots? -- Magioladitis (talk) 18:42, 2 July 2019 (UTC)
May 2019. Primefac (talk) 18:46, 2 July 2019 (UTC)
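The forced-retirement rule described above (both the bot and its operator inactive for two years) can be sketched as a simple check. This is only an illustration: the function name, the use of "last edit" timestamps as the measure of activity, and the 730-day approximation of two years are assumptions, not policy wording.

```python
from datetime import datetime, timedelta

# Approximation of "2 years"; the exact cutoff arithmetic is an assumption.
INACTIVITY_PERIOD = timedelta(days=2 * 365)


def eligible_for_forced_retirement(bot_last_edit, operator_last_edit, now):
    """True only if BOTH the bot and the operator have been inactive
    for the whole inactivity period, as described in the discussion."""
    cutoff = now - INACTIVITY_PERIOD
    return bot_last_edit < cutoff and operator_last_edit < cutoff


if __name__ == "__main__":
    now = datetime(2019, 7, 2)
    # A bot idle since 2013 whose operator edited recently is NOT eligible.
    print(eligible_for_forced_retirement(datetime(2013, 6, 1), datetime(2019, 6, 30), now))
    # Both idle for well over two years: eligible.
    print(eligible_for_forced_retirement(datetime(2013, 6, 1), datetime(2016, 1, 1), now))
```

Under this reading, a bot with no edits since 2013 keeps its flag as long as its operator is still active, unless the operator asks for removal.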

I found many inactive bots (more than 5 years with no edits) whose owners are active. I wonder if we should at least ask bot owners if they are OK to have the bot flag removed for security reasons. I took the liberty to ask User talk:Traveler100. -- Magioladitis (talk) 18:52, 2 July 2019 (UTC)

xaosflux Can you please remove the flags from Traveler100's bots? I contacted them and they agree. -- Magioladitis (talk) 23:41, 3 July 2019 (UTC)

@Magioladitis:   Done per the operator's request. — xaosflux Talk 23:51, 3 July 2019 (UTC)
I did that maybe 2 years ago; it's fine to check in with the inactive operators periodically - they can always re-BRFA, not like it's RfA! — xaosflux Talk 23:52, 3 July 2019 (UTC)

User:KolbertBot is malfunctioning - breaking Archive URLs

  Moved from WP:ANI
BOT: KolbertBot
OP: Jon Kolbert

I just noticed this on something on my watchlist, and I suspect it's a problem everywhere now. While performing "Task #2 : Remove link referral data", the bot removes the referral information from the URLs. Which is fine for most normal URLs, but if the archive URL had referral data, it now doesn't work. Example:

https://en.wikipedia.org/w/index.php?title=Chinese_People%27s_Liberation_Army_Support_Base_in_Djibouti&curid=54667043&diff=904972637&oldid=903790483

Replaced archive-url=https://archive.today/20181206171026/https://thediplomat.com/2018/12/chinas-djibouti-base-a-one-year-update/?utm_source=Sailthru&utm_medium=email&utm_campaign=ebb%2006.12.18&utm_term=Editorial%20-%20Military%20-%20Early%20Bird%20Brief

with

archive-url=https://archive.today/20181206171026/https://thediplomat.com/2018/12/chinas-djibouti-base-a-one-year-update/

Well intentioned, I'm sure, but the new link does not work. Which rather defeats the whole point of having an archive-url to begin with.

Could we consider disabling this feature on "|archive-url" for the time being? Or somehow preventing it from mangling them?

PvOberstein (talk) 03:12, 6 July 2019 (UTC)
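One way to avoid the breakage described above is to leave any URL hosted on an archive service untouched and strip tracking parameters only from live URLs. The following Python sketch illustrates that idea. It is not KolbertBot's actual code: the `ARCHIVE_HOSTS` set is a hypothetical and deliberately incomplete list, the function names are invented for illustration, and only `utm_`-prefixed parameters are treated as referral data.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical, incomplete list of archive hosts; a real bot would need
# to cover every archive provider in use and all of their alias domains.
ARCHIVE_HOSTS = {"archive.today", "archive.fo", "archive.is", "web.archive.org"}


def is_archive_url(url):
    """True if the URL's host is a known archive host or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in ARCHIVE_HOSTS)


def strip_tracking(url):
    """Remove utm_* query parameters, but never modify archive URLs."""
    if is_archive_url(url):
        return url  # the snapshot may be keyed on the full original URL
    parts = urlsplit(url)
    # Filter the raw query string so the remaining parameters keep
    # their original percent-encoding byte for byte.
    kept = [
        pair
        for pair in parts.query.split("&")
        if pair and not pair.split("=", 1)[0].lower().startswith("utm_")
    ]
    return urlunsplit(parts._replace(query="&".join(kept)))


if __name__ == "__main__":
    live = "https://thediplomat.com/2018/12/chinas-djibouti-base-a-one-year-update/?utm_source=Sailthru&utm_medium=email"
    print(strip_tracking(live))
    print(strip_tracking("https://archive.today/20181206171026/" + live))
```

An alternative with the same effect would be to skip the |archive-url= parameter entirely when scanning citation templates, which does not depend on keeping a host list up to date.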

"archive.today" appears to be the only archive "service" that incorrectly uses the URL tracking parameters in this manner. (See for example https://webcache.googleusercontent.com/search?q=cache:https://thediplomat.com/2018/12/chinas-djibouti-base-a-one-year-update/ or http://web.archive.org/web/20190626001057/https://thediplomat.com/2018/12/chinas-djibouti-base-a-one-year-update/ ) Have you notified the bot's operator? ST47 (talk) 03:46, 6 July 2019 (UTC)
@ST47: I believe I have now, thank you. PvOberstein (talk) 14:55, 6 July 2019 (UTC)
I wouldn't assume it is the only one (we use 20-some archive providers), or that it is "incorrect", just how archive.today does things. All the providers have features. -- GreenC 15:34, 6 July 2019 (UTC)
  • Operator Jon Kolbert has been notified. @PvOberstein: are you seeing malfunctions at a high rate, such that blocking may be needed prior to giving the operator a chance to review? — xaosflux Talk 15:23, 6 July 2019 (UTC)
@Xaosflux: I've only seen it on a few pages on my personal Watchlist (since I use archive.today a fair bit) such as in this example, but have yet to encounter it in the wild. I'll defer to more experienced hands as to whether it's a severe enough problem to necessitate blocking. PvOberstein (talk) 15:34, 6 July 2019 (UTC)
The problem is still ongoing diff. -- GreenC 15:42, 6 July 2019 (UTC)
  • Jon and I discussed this before. He said in December 2018, "I have added the necessary adjustments" to avoid modifying archive URLs to 20-some archive providers. -- GreenC 15:27, 6 July 2019 (UTC)
  • @PvOberstein: regarding "the new link does not work" - when I'm checking right now it also appears the old link doesn't work either - is it working for you? — xaosflux Talk 16:06, 6 July 2019 (UTC)
[3] works (for me). -- GreenC 16:08, 6 July 2019 (UTC)
  • Wait what? From your example above, is the bot changing "archive.fo" links to "archive.today", if not what does that have to do with this? — xaosflux Talk 16:15, 6 July 2019 (UTC)
  • archive.today is an alias to archive.fo etc.. they have multiple alias domains. -- GreenC 16:19, 6 July 2019 (UTC)
@GreenC: Hmmm, in your other example of old link to new link, both links are failing for me right now as well. Any chance there is an issue going on with this provider? — xaosflux Talk 16:10, 6 July 2019 (UTC)
Provider is ok looks like something on your end. -- GreenC 16:18, 6 July 2019 (UTC)
archive.today is breaking dns via cloudflare

> server 8.8.8.8
Default Server: dns.google
Address: 8.8.8.8

> archive.today
Server: dns.google
Address: 8.8.8.8

Non-authoritative answer:
Name: archive.today
Address: 51.38.113.224

> server 1.1.1.1
Default Server: one.one.one.one
Address: 1.1.1.1

> archive.today
Server: one.one.one.one
Address: 1.1.1.1

Non-authoritative answer:
Name: archive.today
Address: 127.0.0.3

  • Further research, CloudFlare is just fine, the archive.today people are purposefully giving bad dns responses to anyone trying to resolve them via cloudflare. — xaosflux Talk 02:35, 9 July 2019 (UTC)
  • OK, so Cloudflare is being sucky, but yes confirmed this is causing "breaking" changes. Really, the archiving services should work better, but we can't control that, we can only control our changes - and the changes this bot is making are currently making the article worse for readers. Would like to give the operator a chance to reply before we apply heavy measures (blocking). — xaosflux Talk 16:26, 6 July 2019 (UTC)
    FYI: Additional research indicates cloudflare is working fine, the archive.today people are intentionally breaking dns when cloudflare dns is used to look them up, so archive.today is the one being sucky. — xaosflux Talk 02:35, 9 July 2019 (UTC)
    FYI: Nyttend blocked the bot. — xaosflux Talk 23:11, 6 July 2019 (UTC)

[edit conflict] Since it's been more than a day since Jon Kolbert last edited, and since the bot was still editing today, I've blocked it. Maybe the fix will be really simple, so I've told him basically "you may unblock this bot when you think it's fixed". The point of WP:NEVERUNBLOCK is to stop disruptive unblocking, and as applied to bots it's to prevent someone from unblocking his bot against opposition from the blocking admin and others (e.g. to prevent wheel wars). I just want him to address the bot's behavior before it makes any more edits, and that's why I'm fine with him unblocking at will. This will not be a disruptive self-unblock, and it's one of those rare cases where we can ignore the rules to make things work more smoothly. Of course, any other admin should feel free to unblock at will. Nyttend (talk) 23:17, 6 July 2019 (UTC)

  • Thank you, Anomie; I'd never noticed that. I figured that unblocking your own bot was always inappropriate, unless you'd blocked it or you had a reasonable IAR justification like this one. Nyttend (talk) 22:06, 8 July 2019 (UTC)

Another case from July 5. WaybackMedic has been finding and deleting broken archive.today links for months, I assumed it was user entry error, but now believe many are due to KolbertBot. There must be thousands given how many I found and the happenstance of two bots editing the same articles. -- GreenC 15:29, 7 July 2019 (UTC)

User:RonBot #11

Hello, As others have noted at Wikipedia:Administrators'_noticeboard/Archive309#User:RonBot, User:RonBot and its creator User:Ronhjones have not been active since the first week of April this year, although most of his bots are marked as Running. I am particularly interested in #11, which searched declined AfC submissions for biographies of women, and added newly declined drafts to Wikipedia:WikiProject Women in Red/Drafts once a week. I was able to use it to develop over a dozen declined drafts to acceptable articles about notable women, but without the bot, there is no way of identifying drafts relevant to the Women in Red project. Is it at all possible for someone else to operate the bot, or to check why it's not running automatically, even though it says it is? It would be very beneficial to have it active again. Thanks, RebeccaGreen (talk) 16:25, 8 July 2019 (UTC)

Hello @RebeccaGreen:, only Ronhjones can address issues with their bot. I sent them an email about this discussion as well. We are unable to make their bot make edits. If there is no response you can request someone else make a clone (copy) of that bot to do the same task at WP:BOTREQ. Thanks, — xaosflux Talk 16:35, 8 July 2019 (UTC)
Hmm, they seem to have disabled their email, so it never went. — xaosflux Talk 16:36, 8 July 2019 (UTC)
Thanks for your help, I'm glad to know what to do. I think I will have to request a clone, as it's been three months, and the email being disabled is not a hopeful sign. Cheers, RebeccaGreen (talk) 16:40, 8 July 2019 (UTC)
For the record here's the BRFA, which has the source code attached. Primefac (talk) 19:44, 8 July 2019 (UTC)

I just want to remark that having bots run for months and not being around is clearly against BOTPOL, but we have so many of those and generally nothing is done about it until something goes wrong. —  HELLKNOWZ   ▎TALK 20:11, 8 July 2019 (UTC)

We can have a bit of WP:IAR leeway in the case of correctly functioning bots that aren't causing issues. Headbomb {t · c · p · b} 09:48, 12 July 2019 (UTC)
Hellknowz, I will say if the bot is doing what it's supposed to, then don't try and break it by stopping it. We shouldn't bother to hunt down orphaned bots if they are still doing a good job at what they're supposed to be doing. —CYBERPOWER (Chat) 23:13, 12 July 2019 (UTC)

Midnight rollover fixes

  Moved from Wikipedia talk:Bots/Requests for approval § Midnight rollover – —⁠andrybak (talk) 09:24, 12 July 2019 (UTC)

I've been thinking of trying out Pywikibot on Wikipedia. Before writing any code and starting a BRFA, I would like to ask bot users about a possible bot task. Editors who live in timezones close to UTC±00:00 are likely to hit an unfortunate point, where an XfD page is created before midnight, but notifications are sent out after midnight. This leaves links which lead to empty (if pages are created by a bot, like Redirects for discussion pages by User:DumbBOT) or non-existent "next day" discussion pages. I once noticed this issue in another editor's notifications, and got hit by it today. A possible algorithm could be:

  1. for bot created pages, check if "next day" page has been created
  2. if the page is not created or the only change is creation by the bot, then go to the next step
  3. go through user (and sometimes WikiProject) talk pages from Special:WhatLinksHere
  4. and try to substitute instances of "yyyy Month d" to previous date

Would a bot that fixes these issues be useful? —⁠andrybak (talk) 00:23, 12 July 2019 (UTC)
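Step 4 of the algorithm above (rewriting "yyyy Month d" to the previous date) can be sketched in plain Python. This covers only the text substitution: checking whether the "next day" page is empty and walking Special:WhatLinksHere (steps 1 to 3) are left out, and the function names and the example link are hypothetical.

```python
import re
from datetime import date, timedelta

# Month names are spelled out explicitly to avoid locale-dependent formatting.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def format_log_date(d):
    """Format a date as 'yyyy Month d', e.g. '2019 July 12'."""
    return f"{d.year} {MONTHS[d.month - 1]} {d.day}"


def shift_to_previous_day(wikitext, empty_day):
    """Replace every mention of empty_day's date with the previous day's date."""
    old = format_log_date(empty_day)
    new = format_log_date(empty_day - timedelta(days=1))
    # The lookarounds stop '2019 July 1' from matching inside '2019 July 12'.
    pattern = re.compile(r"(?<!\d)" + re.escape(old) + r"(?!\d)")
    return pattern.sub(new, wikitext)


if __name__ == "__main__":
    notice = "[[Wikipedia:Redirects for discussion/Log/2019 July 12#Example]]"
    print(shift_to_previous_day(notice, date(2019, 7, 12)))
```

Using date arithmetic rather than decrementing the day number keeps month and year boundaries correct, for example turning "2019 August 1" into "2019 July 31".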

If the discussion page is already non-empty (new discussions were added between the actions of the unfortunate user), the bot could also compare the link to both yesterday's and today's section titles. But that could be construed as violating WP:CONTEXTBOT. —⁠andrybak (talk) 11:03, 13 July 2019 (UTC)

User:Marianne Zimmerman

There is nothing else for us to do here at BOTN, please follow up at the discussion(s) on the other pages. Thank you for the notice. — xaosflux Talk 13:49, 14 July 2019 (UTC)
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Cross-posted here, at User talk:Marianne Zimmerman, at User talk:Citation bot, and at User talk:Smith609

This account has made tens of thousands of edits by proxy using the Citation bot. It is still ongoing while I'm writing this. The account itself has made only 11 edits so far.

It is obvious that this 'Marianne Zimmerman' account is a bot, since it is working around the clock, 24/7. The account is not labeled as such, and has not been authorized by the Bot Approvals Group. In itself not a big deal, because the account has been making only positive edits and has not caused disruption. Still, it is technically violating policy, and I'm wondering why a bot would use another bot to make bot edits. That seems rather silly. I hope the author of the 'Marianne bot' can come forward so that we can work things out. Cheers, Manifestation (talk) 12:04, 14 July 2019 (UTC)

Account has been blocked. - Manifestation (talk) 13:44, 14 July 2019 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Community Tech bot - Popular pages, stopped updating

Greetings, At VPT I posted a notice here stating the bot has stopped updating "Popular pages" 12:08, 13 July 2019. Further investigation is needed to get the bot running again. Regards, JoeHebda (talk) 15:07, 15 July 2019 (UTC)

Template:Bot discussion

I've started a discussion at Template_talk:Bot#status=expired which I'd appreciate some input on. Apologies for cross-posting. --kingboyk (talk) 23:45, 15 July 2019 (UTC)

WP:SCRIPTREQ

I added the following hatnote

to the WP:BOTREQ header. I've been here for 12+ years, and today I thought, hey, why don't we have a script request page like we do for bots? That seems odd that no one thought of that before?

And what do you know, there is one. So now the children will see it and be joyful, and their hearts will rejoice. Headbomb {t · c · p · b} 21:27, 26 July 2019 (UTC)

CfD: Category:Indefinitely blocked Wikipedia bots

As a follow up to the discussion that transpired at Template talk:Bot#status=expired. Headbomb {t · c · p · b} 21:32, 26 July 2019 (UTC)

New BAG nomination: Enterprisey

Hi! This is a notice that I have nominated myself for the Bot Approvals Group. I would appreciate your input. Thanks! Enterprisey (talk!) 06:15, 31 July 2019 (UTC)

BOTPOL regarding triggering users

Hello all, please see Wikipedia_talk:Bot_policy#Bots_triggered_by_multiple_users for a discussion on codifying expectations for bots triggered by others. — xaosflux Talk 13:57, 7 August 2019 (UTC)

Bots Newsletter, August 2019


Greetings!

Here is the 7th issue of the Bots Newsletter. A lot has happened since last year's newsletter! You can subscribe/unsubscribe from future newsletters by adding/removing your name from this list.

Highlights for this newsletter include:

ARBCOM
  • Nothing of note happened. Just like we like it.
BAG

BAG members are expected to be active on Wikipedia to have their finger on the pulse of the community. After two years without any bot-related activity (such as posting on bot-related pages, posting on a bot's talk page, or operating a bot), BAG members will be retired from BAG following a one-week notice. Retired members can re-apply for BAG membership as normal if they wish to rejoin the BAG.

We thank former members for their service and wish Madman a happy retirement. We note that Madman and BU Rob13 were not inactive and could resume their BAG positions if they so wished, should their retirements happen to be temporary.

BOTDICT

Two new entries feature in the bots dictionary

BOTPOL
  • Activity requirements: BAG members now have an activity requirement. The requirements are very light, one only needs to be involved in a bot-related area at some point within the last two years. For purpose of meeting these requirements, discussing a bot-related matter anywhere on Wikipedia counts, as does operating a bot (RFC).
  • Copyvio flag: Bot accounts may be additionally marked by a bureaucrat upon BAG request as being in the "copyviobot" user group on Wikipedia. This flag allows using the API to add metadata to edits for use in the New pages feed (discussion). There is currently 1 bot using this functionality.
  • Mass creation: The restriction on mass-creation (semi-automated or automated) was extended from articles to all content pages. There are subtleties, but content here broadly means whatever a reader could land on when browsing the mainspace in normal circumstances (e.g. Mainspace, Books, most Categories, Portals, ...). There is also a warning that WP:MEATBOT still applies in other areas (e.g. Redirects, Wikipedia namespace, Help, maintenance categories, ...) not explicitly covered by WP:MASSCREATION.
BOTREQs and BRFAs

As of writing, we have...

  • 20 active BOTREQs, please help if you can!
  • 14 open BRFAs and 1 BRFA in need of BAG attention (see live status).
  • In 2018, 96 bot tasks were approved. An AWB search shows approximately 29 were withdrawn/expired, and 6 were denied.
  • Since the start of 2019, 97 bot tasks were approved. Logs show 15 were withdrawn/expired, and 15 were denied.
  • 10 inactive bots have been deflagged (see discussion). 5 other bots have been deflagged per operator requests or similar (see discussion).
New things
Other discussions

These are some of the discussions that happened / are still happening since the last Bots Newsletter. Many are stale, but some are still active.

See also the latest discussions at the bot noticeboard.

Thank you! edited by: Headbomb 17:24, 7 August 2019 (UTC)


(You can subscribe or unsubscribe from future newsletters by adding or removing your name from this list.)

Civil parish bot

There is a discussion at Wikipedia:Village pump (proposals)#Civil parish bot help (if someone can code a bot ready) for this would be appreciated, thanks. Crouch, Swale (talk) 11:06, 13 August 2019 (UTC)

User:Legobot Request

I have tried to contact User:Legoktm regarding a request for User:Legobot found here on July 24. Talk page message to them on July 30 which was archived without a response here. Another ping today and note on their talk page here. Although I understand and appreciate that there is no compulsory requirement to edit, per WP:BOTCOMM I expect bot questions to be addressed promptly. This is especially true for a bot like Legobot that manages so many important processes. So, my question to this group is how to best manage this issue? « Gonzo fan2007 (talk) @ 20:00, 15 August 2019 (UTC)

  • Thanks for posting @Gonzo fan2007: if I understand the need correctly: the GAN process would like to make a change, and wants to ensure it is coordinated with Legobot Task 33, yes? — xaosflux Talk 20:09, 15 August 2019 (UTC)
    • Yep Xaosflux. In the past there has been a desire to split up topics in sub-topics (among other feature requests). If you look at WP:GAN real quick, you will see for example that the Social sciences and society topic has many sub-topics, whereas Sports and Recreation has none. However, Legobot manages this whole process, thus no changes can be made until Legobot is updated. The process for how WP:GAN works can be found at WP:GAN/I, specifically how to categorize the article when adding {{GAN}}. « Gonzo fan2007 (talk) @ 20:16, 15 August 2019 (UTC)
      • Thank you @Gonzo fan2007:. Talk notice and email notice sent to the operator. — xaosflux Talk 21:41, 15 August 2019 (UTC)
        • Thanks Xaosflux. « Gonzo fan2007 (talk) @ 21:45, 15 August 2019 (UTC)
          • Hasn't edited in 3 weeks and no (obviously) replies to this issue, is there a way someone else can make the desired changes to the script, outside of disabling the bot ? - FlightTime (open channel) 22:11, 15 August 2019 (UTC)
            • The bot can be turned off for just certain pages with {{bots|deny=Legobot}} - so it could be disabled on just certain GA pages. — xaosflux Talk 23:01, 15 August 2019 (UTC)
              • The technical item that I am requesting is a subtle change to how the bot responds to the user input when adding {{GAN}} to a talk page to nominate an article for GA status. As part of adding that template to the talk page, a user has to add the |subtopic= from the list of subtopics found on the documentation sub-page of {{GAN}}. Based on this human input, the bot lists the page at WP:GAN under the correct sub-topic. This helps reviewers focus their efforts on smaller categories of articles. However, the topic of Sports and Recreation (which spans a lot of articles from city parks to the Premier League) has gotten so large (75 articles right now) that it is difficult to focus on any smaller topics. So, what would need to be done for this request is a set of human edits to {{GAN}} (new sub-topic options) and WP:GAN (new sub-topic section headers and categorization of the existing nominations), which I am happy to do. However, I imagine Legobot probably will need to add these new sub-topics to its code to be able to properly respond to these new sub-topics. There are probably other human edits that will be needed as well ({{GA}} will also need to be edited to add the new sub-topics; new categories will need to be created, etc). I don't have the technical expertise to grasp how difficult these changes would be to the bot, but am willing and happy to make the human edits. « Gonzo fan2007 (talk) @ 23:22, 15 August 2019 (UTC)
I stopped responding a while back to all the GAN requests unfortunately. I'm basically keeping it alive in maintenance mode and accepting patches. I'm open to having someone else take it over entirely, but last time there was some issue in transferring the database, I don't really remember. Legoktm (talk) 21:56, 16 August 2019 (UTC)
@Legoktm: While you're alive, could you take a look at Wikipedia_talk:Dashboard#Bots_noticeboard_not_working? and Wikipedia_talk:Dashboard#Bot_section?. Headbomb {t · c · p · b} 22:27, 16 August 2019 (UTC)
Thanks for the response Legoktm. Xaosflux, what's the best way to proceed? The WP:GAN process is pretty significant; ideally the bot that runs that page would have an owner that can make changes/updates/etc. Is there somewhere that we can post for any interested bot owners to take over this task? « Gonzo fan2007 (talk) @ 14:42, 19 August 2019 (UTC)
@Gonzo fan2007: WP:BOTREQ is where you could ask for someone to take it over. As far as what to do until then - we could stop the bot on pages that you want to change that it will not be compatible with (see the nobots directive note above). Once a new bot takes over the process, this task could certainly be shut down and the operator seems to be fine with that approach. — xaosflux Talk 15:12, 19 August 2019 (UTC)
Posted at Wikipedia:Bot requests#Operator to take over Legobot Task 33. There's no point in stopping the bot and making changes unless the bot code has been updated first. The bot would merely supersede any changes at its next edit. Thanks for the assistance Xaosflux. « Gonzo fan2007 (talk) @ 16:06, 19 August 2019 (UTC)
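
As an illustration of the {{bots|deny=Legobot}} directive mentioned in this thread, here is a minimal sketch of how a bot could check a page's wikitext before editing. It is not Legobot's actual code; the function name `bot_may_edit` is hypothetical, and the parsing is deliberately simplified (it only looks at {{nobots}} and the first `allow=` or `deny=` parameter of {{bots}}).

```python
import re

def bot_may_edit(wikitext, bot_name):
    """Return False if the page opts out of edits by bot_name (simplified check)."""
    # {{nobots}} excludes every compliant bot.
    if re.search(r"\{\{\s*nobots\s*\}\}", wikitext, re.IGNORECASE):
        return False
    # Look for {{bots|allow=...}} or {{bots|deny=...}} with a comma-separated list.
    m = re.search(r"\{\{\s*bots\s*\|\s*(allow|deny)\s*=\s*([^}|]*)",
                  wikitext, re.IGNORECASE)
    if not m:
        return True  # no directive found: editing is permitted
    kind = m.group(1).lower()
    names = {n.strip().lower() for n in m.group(2).split(",") if n.strip()}
    me = bot_name.lower()
    if kind == "deny":
        return not (me in names or "all" in names)
    # kind == "allow": only listed bots (or "all") may edit
    return me in names or "all" in names
```

With this sketch, a page carrying {{bots|deny=Legobot}} would be skipped by Legobot while other bots could still edit it.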

Template:BRFA help

New to bots on Wikipedia? Read these primers! [Hide this box]

I just created this box you see on the right, similar to {{AFD help}}. It will automatically show on individual BRFA pages, but you can suppress it in your skin if you want (see the 'Hide this box' instructions in the corner of the box). The goal is mostly to make the BRFA pages more accessible to non-technical folks that might be asked to give opinions on certain bot tasks.

Tweaks/improvements welcome. Headbomb {t · c · p · b} 16:26, 18 August 2019 (UTC)

I don't necessarily disagree with this template being created, but I strongly disagree with it being included in {{newbot}} (seen in this previous version); for the vast (vast) majority of BRFAs it will be completely unnecessary (I think we've had one new bot in the last two months?). Primefac (talk) 18:47, 18 August 2019 (UTC)
The point is that everyone is welcomed to comment, including newbies, and we ought to make things as friendly as possible for everyone. Including the help box in BRFAs is of no detriment to anyone. Headbomb {t · c · p · b} 20:17, 18 August 2019 (UTC)
(edit conflict) In principle it's not a bad quick link reference, but I feel it's pointless for the BRFA audience in the depths of technical WP pages. That said, I wouldn't mind if it was added. It doesn't make anything worse even if the benefit is marginal in case someone that new comes around to comment. —  HELLKNOWZ   ▎TALK 20:34, 18 August 2019 (UTC)
I like it. It also helps the old-timers who left and came back years later, to remember the details of how things work.  ‑Scottywong| chat _ 03:24, 20 August 2019 (UTC)

About Legobot and "Bots noticeboard not working" - it is a crontab issue

This noticeboard shows up as having no threads when seen at Wikipedia:Dashboard. Wikipedia:Dashboard/Administrative noticeboards is updated by Legobot. The code for doing this is not within Legobot itself. Instead Legobot is run hourly from a crontab and the list of noticeboards to read from is listed in that file. The crontab still says to read from Wikipedia:Bot owners' noticeboard which became a redirect when this noticeboard was moved to "Wikipedia:Bots/Noticeboard". Since nothing is changing at the redirect, the noticeboard appears to have nothing on it. We ran into the same problem when the Teahouse was moved, but the crontab was fixed then. I ran into the crontab (or a copy of it) by chance last summer when exploring how Wikidata was first populated by bots and how it is updated now. The editor involved was on vacation and I have now forgotten where I saw it and who the editor was. Can anyone here point me to someone to contact about this? StarryGrandma (talk) 16:27, 26 August 2019 (UTC)
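
One way a script could avoid this kind of breakage is to let the MediaWiki API follow redirects instead of hard-coding the old title. The following is a rough sketch only, not the code Legobot runs: the helper names are hypothetical, and the sample response in the usage note is hand-written to mirror the `query.redirects` list that the API returns when `redirects` is set.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"

def redirect_query_url(title):
    """Build an action=query URL that asks the API to resolve redirects."""
    params = {"action": "query", "titles": title, "redirects": 1, "format": "json"}
    return API + "?" + urlencode(params)

def resolved_title(response, title):
    """Return the redirect target for title, or title itself if not redirected."""
    for entry in response.get("query", {}).get("redirects", []):
        if entry.get("from") == title:
            return entry["to"]
    return title
```

For example, given a response containing `{"from": "Wikipedia:Bot owners' noticeboard", "to": "Wikipedia:Bots/Noticeboard"}`, `resolved_title` would return the new noticeboard title. The sketch ignores title normalization and chained redirects.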

MfD of a Bot’s page

User:ListeriaBot writes irregularly several times per month to Wikipedia:ORCID/Items_with_ORCID_identifiers, and that page is read ~ 5-15 times per day, every day. User:ListeriaBot is controlled by Magnus Manske (talk · contribs) who has not edited for 3 months.

Wikipedia:ORCID/Items with ORCID identifiers has been nominated for deletion at Wikipedia:Miscellany for deletion/Wikipedia:ORCID/Items with ORCID identifiers. Please comment there. —SmokeyJoe (talk) 10:30, 22 September 2019 (UTC)

Operator notice and email left. — xaosflux Talk 00:04, 23 September 2019 (UTC)
The page and the template that invokes ListeriaBot were created by Daniel Mietchen. --Magnus Manske (talk) 07:30, 24 September 2019 (UTC)

User:UTRSBot

This bot appears to have been down for a few weeks and its maintainer is mostly inactive. I for one found it very handy to have on-wiki records of UTRS appeals so that transparency is maintained for the appeals process. Beeblebrox (talk) 20:23, 24 September 2019 (UTC)

@Beeblebrox: thanks for the note. There really isn't anything we can do about someone not running a bot; I'll try contacting the operator though. If it stays down you can request someone make a replacement bot at WP:BOTREQ. — xaosflux Talk 20:24, 24 September 2019 (UTC)
Out of curiosity, why don't we get another botoperator running the same code since all of it is published on github? I've noticed that this is a remarkably uncommon solution and guess there's some underlying reason. --Trialpears (talk) 20:44, 24 September 2019 (UTC)
@Trialpears: it is normally a fine solution, just most of the time no one wants to actually step up to be a bot operator for a continuously-running bot. — xaosflux Talk 22:32, 24 September 2019 (UTC)
Bot operators/maintainers generally have to understand the language the bot's written in. In this case, that's PHP. Proper maintenance of this bot also would require adminship. That cuts down your potential candidates even further. --AntiCompositeNumber (talk) 22:55, 24 September 2019 (UTC)
I think both CBP and MagikAnimal are both conversant in PHP, but I do not know if they would have bandwidth given the other things on their plate. --Izno (talk) 15:57, 25 September 2019 (UTC)
If need be, I'm familiar with PHP, operating bots, and UTRS. SQLQuery me! 18:52, 25 September 2019 (UTC)
Izno, I can 'run' it in place of another operator, but I won't be able to maintain it. You are right my plate is full right now. I'm hardly finding time to rewrite my RFPP task to conform to the new format the community wants. —CYBERPOWER (Chat) 20:48, 26 September 2019 (UTC)
Ah, so that task hasn't been completely forgotten about. * Pppery * it has begun... 01:13, 27 September 2019 (UTC)
  • Hi Guys. Everyone is right, I'm not able to maintain the bot right now. I'm unlikely to return to actively editing Wikipedia - ever. Life is full and Wikipedia isn't high on my list. The key problem here isn't getting the bot working - that's quite easy. The problem is that a troll has been submitting fake requests on behalf of real users. We were looking at requiring oAuth for registered account requests and still allow anonymous for IPs. However, neither DQ nor I have the time to do it. We reached out to a non-profit called 42 Silicon Valley in San Francisco to see if they'd be willing to take over development and I don't know the status of that. Regarding the bot, something has to be done to prevent the troll from bothering users or submitting fake requests. If anyone has an idea, the code base is freely licensed and available on Github. If anyone needs me to approve their account on UTRS's WMFLabs account, I will do that for any administrator that is also identified to the WMF (because of privacy data). I miss you guys a whole lot, a lot of my negative emotions towards Wikipedia and all of you are long in the past, but I've also moved on with my life. Good luck, let me know how I can help.--v/r - TP 16:13, 5 October 2019 (UTC)
  • @Beeblebrox: I apologize, I did not realize it was the master table that was down, I thought it was just the user talkpage issue. Had I known that, I would have fixed the issue ages ago. It's fixed now. That said, I have given @SQL: basic developer access so he can work on the user talkpage portion which is actively still broken. -- Amanda (aka DQ) 06:17, 6 October 2019 (UTC)

WP 1.0 bot - Error - 503 Service Unavailable

Greetings, This morning I posted a notice here about bot not running. Because I have no idea how to get it re-started, wondering if someone here could help. Regards, JoeHebda (talk) 18:34, 23 October 2019 (UTC)

@JoeHebda: User:WP 1.0 bot is operated by User:Audiodude and User:Kelson - you can try their talk pages. — xaosflux Talk 19:25, 23 October 2019 (UTC)
Hmm wait, you are talking about a problem off-wiki right? for https://tools.wmflabs.org/enwp10/cgi-bin/list2.fcgi? If so, it happens to be those same maintainers. — xaosflux Talk 19:28, 23 October 2019 (UTC)
@Xaosflux: - I did the group ping. It usually takes a long time (days or weeks) for a response. That's why I am asking for help here for a more immediate fix. Previously, I had asked for a bot Restart button & was told it may be years for that to happen. JoeHebda (talk) 20:15, 23 October 2019 (UTC)
@JoeHebda: Unfortunately us Wikipedia editors can not do anything about tools.wmflabs.org services. That service has an "issues" link to phab. You can try emailing the operators if they don't reply to their talk pages. — xaosflux Talk 20:19, 23 October 2019 (UTC)
Bot operator here. I got the message to restart the tool yesterday morning my time and restarted it. Looks like it's working fine now. Sorry for any delays or inconvenience. audiodude (talk) 02:02, 26 October 2019 (UTC)

reFill 2

reFill 2 is a much-used tool for expanding bare references, however its creator/maintainer seems to have retired months ago, and bug reports seem to have gone unresolved for even longer. I know this tool is not a bot, but it was suggested that I inquire here. I was hoping to either recruit one or more editors to take over the tool's maintenance, or at least be referred to another talk page where I'd be likely to find said editors. Thanks.— TAnthonyTalk 16:10, 28 October 2019 (UTC)

@TAnthony: WP:VPT may be better. — xaosflux Talk 16:13, 28 October 2019 (UTC)
That's where I was referred here :/ — TAnthonyTalk 16:16, 28 October 2019 (UTC)
You can try wikitech-l. — xaosflux Talk 16:38, 28 October 2019 (UTC)
TAnthony, User:Cyberpower678 was looking into the possibility of adopting it. -- GreenC 17:38, 28 October 2019 (UTC)

Bot gone mad?

What is this about? News to me I'd been blocked, & I therefore launched no appeal. Johnbod (talk) 15:12, 11 November 2019 (UTC)

@Johnbod: will follow up at your talk, the "bot part" seems to be functioning correctly - but there may be a different issue. — xaosflux Talk 16:06, 11 November 2019 (UTC)

Useful bot idea?

IABot blue linking to Internet archive books

To a degree I have raised this previously and somewhat poorly in September 2019 at Wikipedia:Village pump (policy)/Archive 154#BOT linking to archive.org possible copyrighted sources. And I learnt a real lots on open library and worldcat out of it and finding sources in books held on archive.org (search foobar site:archive.org). While bot url blue linking seemed to have ceased after that I have observed this has restarted. For example: [4]. This is actually really cool and really useful fantastic stuff. And I use IAbot regularly myself for Linkrot. My concerns revolve around the issues of book loans from Internet Archive and how this is being pointed to with a possible bias as opposed to worldcat where other sources can be located. ( The example I have above has isbn but that wouldn't be present pre-1980s? or so). In summary:

  • At User:InternetArchiveBot IABot is described as 'an advanced bot designed to combat link rot, as well as fix inconsistencies in sources such as improper template usage, or invalid archives'. Blue linking arguably goes beyond the WP:LINKROT brief.
  • I would be more comfortable if the BOT when blue linking also provided a worldcat (olcl=) link or perhaps an open library link (ol=) which gives library locations (worldcat) or also shows possible sellers (open library). I have done this for the example above.[5]

Thankyou.Djm-leighpark (talk) 22:57, 11 November 2019 (UTC)

The diff you linked shows that there was already an ISBN present. The ISBN links to Special:BookSources, which leads to all of the sources you list and many more. – Jonesey95 (talk) 00:56, 12 November 2019 (UTC)
As I said there are cases where no isbn is present, or it may be the BOT operates only when an isbn is present? I suppose I could try and reverse engineer it or something but it is not reasonable for it to even try and I already mentioned about the isbn in this example. Thankyou. Djm-leighpark (talk) 02:31, 12 November 2019 (UTC)
Seems to be GreenC bot doing this but it may be a called procedure. Is this authorised? Djm-leighpark (talk) 07:33, 14 November 2019 (UTC)
@Djm-leighpark: can you provide some diffs of recent edits that you are concerned with? — xaosflux Talk 12:16, 14 November 2019 (UTC)

Recent diffs from my Watchlist:

These are typically useful, but I have the following concerns:

  1. This functionality does not appear to be shown on the IABot or GreenC bot userpage, or it may be obfuscated. Rather than 'correcting' a citation this is adding to it.
  2. Just a concern in case this functionality has not been authorised. (Probably I don't know where to look).
  3. Concern if any issues if pointing towards open library loans as opposed to leaving links for commercial buying/finding at worldcat. I may be alone on this. This is partly because, as far as I know, this is an openlibrary/Internet Archive funded initiative which is biased towards pointing at Internet Archive resources away from other resources. I may be alone on this.
  4. Rather than just pointing URLs consideration should be made to also leaving the ol= identifier (and ideally oclc identifier as well). I may be alone on this.
  5. I am concerned the page links may not work on some documents .... I thought I had an example but it was not an Internet Archive resource so there may not be an issue.

Thankyou.Djm-leighpark (talk) 13:52, 14 November 2019 (UTC)

The project has been in the news[6][7][8][9][10] etc. There is approval for adding the books per BRFA and RFC. GreenC and Cyberpower678 are disclosed paid editors of Internet Archive and collaborating on the project. The project is fully supported by the WMF who consider the Internet Archive one of their closest partners, both are non-profit organizations with overlap, we try to collaborate with non-profits vs commercial organizations. Brewster Kahle wants to scan every book cited on Wikipedia so that it can be linked to directly at the page number, this represents 10s of millions of dollars and years of effort, every few weeks they are sending a shipping container full of books to various countries for scanning. -- GreenC 14:34, 14 November 2019 (UTC)

  • Thankyou for that information ... perhaps it would be useful to have a link to it from the bot user page ... perhaps it is already there and I've missed it (if it was on the Did you know I'd probably still miss it). On the positive side I am kind of leveraging some of the side effects of this into articles already. Again thankyou for the information.Djm-leighpark (talk) 14:57, 14 November 2019 (UTC)
I'm not opposed to IABot adding |url= links to cs1|2 citation templates as long as those that require registration are so marked. That appears to be happening. But, when |title= is wholly or partially wikilinked, adding a value to |url= causes URL–wikilink conflict errors. I think that I've discussed this issue with Editor Cyberpower678 though I can't find where I did that. My recollection of that conversation was that Editor Cyberpower678 was not interested in fixing that. I fully admit that I may be mistaken about this impression. It would be good to see it fixed because I grow weary of fixing these damned errors when they should not have been created in the first place.
Trappist the monk (talk) 15:06, 14 November 2019 (UTC)
I don't know about the history but IABot does not do this currently. -- GreenC 15:52, 14 November 2019 (UTC)
Really? Here are a couple of today's IABot/GreenC bot edits that broke the cs1|2 templates:
German occupation of Norway 04:29, 14 November 2019 UTC
Program counter 07:20, 14 November 2019 UTC
If it isn't IABot then it must be GreenC bot that is breaking these templates.
Trappist the monk (talk) 15:59, 14 November 2019 (UTC)
Oh yes, that is a problem. Should be fixed now. It was skipping some instances and not others. -- GreenC 16:32, 14 November 2019 (UTC)
Good. Thanks.
Trappist the monk (talk) 16:34, 14 November 2019 (UTC)
Trappist the monk, I’m not sure what I did to leave that impression, but I take bug reports seriously. —CYBERPOWER (Around) 03:13, 15 November 2019 (UTC)
@Djm-leighpark: maybe I didn't explain my ask well - sorry; can you provide some actual on-wiki revisions diff's of edits you think are problematic? An example of what I'm looking for would be this or this. Looking to see which account made exactly which edit that needs further review. — xaosflux Talk 15:12, 14 November 2019 (UTC)
  • I've tweaked the examples above to show the Diff's better. I don't know there's anything technically wrong with any of them; just concerns as I now know a little better about what's going on. I must confess to being a total Luddite and stay away from the interactive editor ... However I can see cites getting longer and longer and getting more messily intertwined with the prose. WP:LDR solves it but sends interactive editors (mostly everyone but me) bananas. Harvard's rubbish on web cites. But I wonder how well the BOT copes with WP:LDR and Harvard and citations such as Rennie (1849) in John Rennie the Elder where there are many pages to link? Djm-leighpark (talk) 16:18, 14 November 2019 (UTC)
    • Currently it doesn't. It only adds a link from a specific page mention. Most of the book citations at John Rennie the Elder would also be ignored because they don't use a {{cite book}} with an ISBN, as far as I know. Nemo 21:39, 25 November 2019 (UTC)

User:GreenC bot and edit filters

Is it possible to flag GreenC bot so it's not blocked by edit filters? I'm working to replace a blacklisted domain with archive versions such as:

[http://example.com]

With:

[http://web.archive.org/web/20190901/http://example.com]

The domain was hijacked/usurped by spammers so the old archives have good content. The edit filter is blocking upload of diffs since it is part of the archive URL. -- GreenC 01:11, 23 November 2019 (UTC)

@GreenC: I think the issue is the spam blacklist, since I see only 3 filter hits since 2017 (2 last february, and 1 on a testing filter) DannyS712 (talk) 01:33, 23 November 2019 (UTC)
OK this is correct it says spam blacklist, I was thinking they were the same .. should I post over there? Not sure what the options are if the account can be flagged or remove the listing temporarily. -- GreenC 01:46, 23 November 2019 (UTC)
I used example.com above but the actual domain is blackwell-synergy.com -- GreenC 01:52, 23 November 2019 (UTC)
See phab:T36928 - despite being open and certainly could be useful, it has been ignored by developers for 5 years, maybe at the end of 2020 we can beg for it in the wishlist process again... — xaosflux Talk 02:06, 23 November 2019 (UTC)
Xaosflux, ^ this. —CYBERPOWER (Around) 02:45, 23 November 2019 (UTC)
@Xaosflux: // or, I just spent a few minutes learning the basic code base of the extension and wrote a patch. Now we just need to convince people that it should be added DannyS712 (talk) 02:56, 23 November 2019 (UTC)
+1 that! — xaosflux Talk 03:12, 23 November 2019 (UTC)
I love how the sbl board only has "backlog" lol. — xaosflux Talk 03:13, 23 November 2019 (UTC)
@GreenC: Can a pattern for the URLs that should be allowed be added to MediaWiki:Spam-whitelist? Anomie 14:22, 23 November 2019 (UTC)
Anomie, not really unless you want to blanket archive web.archive.org and other archiving services. But then, that would be open to abuse. You could tighten the regex but it would be absurdly long. The best case would be to let IABot and GreenC bot have a permission that can override the blacklist. —CYBERPOWER (Chat) 14:34, 23 November 2019 (UTC)
@Cyberpower678: Doesn't seem that difficult to me, since archive.org encodes the archive date in the URL and archive.org is the archive site requested here. Let's say we want to allow only archive links before 2017, because the domain was hijacked in early 2017. Something like \bweb\.archive\.org/web/(?:19\d{2}|200\d|201[0-6])\d{10}/http://example\.com/ should do it. Anomie 14:47, 23 November 2019 (UTC)
Anomie, What about the other domains that have been, or will be hijacked? This won't be, hasn't been the only case where the bots have been stopped because of the spam blacklist. —CYBERPOWER (Chat) 14:51, 23 November 2019 (UTC)
On the other hand, what about the concerns raised on the task regarding situations where some editors can (accidentally) bypass the blacklist and others can't? Anomie 14:59, 23 November 2019 (UTC)
My thought was the right be given temporarily for specific jobs then removed when completed. Sort of a temporary privilege one can apply for. Adds overhead but shouldn't be a common request only if doing bot or other automated tasks specific to a blacklisted URL (not incidental). -- GreenC 15:37, 23 November 2019 (UTC)
@Anomie: we could make a group for it, and allow "sysop" to "add/remove", and allow "bot" to "addself/removeself" - sort of like how the 'flood' group is on other projects. — xaosflux Talk 16:48, 23 November 2019 (UTC)
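
The whitelist pattern Anomie suggested above can be tried out quickly before anyone adds it anywhere. This is only a sketch: it uses Python's `re` module rather than the PHP regex engine MediaWiki uses, on the assumption that this particular pattern behaves the same in both, and the helper name `is_whitelisted` and the sample URLs are made up for illustration.

```python
import re

# Pattern as proposed in the thread: a 14-digit archive timestamp whose
# year is 2016 or earlier, followed by the original http://example.com/ URL.
WHITELIST_RE = re.compile(
    r"\bweb\.archive\.org/web/(?:19\d{2}|200\d|201[0-6])\d{10}/http://example\.com/"
)

def is_whitelisted(url):
    """Return True if url matches the proposed whitelist pattern."""
    return WHITELIST_RE.search(url) is not None
```

A snapshot URL such as http://web.archive.org/web/20160315120000/http://example.com/page matches, while one with a 2017 timestamp does not. Note that the pattern expects a full 14-digit timestamp and a slash after the domain, so a shortened URL like the 8-digit example earlier in this thread would not match either.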

Hi, thanks everyone. I ended up converting to doi.org (or |doi=) and deleting the original URLs, after the run on 550 pages it was left with 14 pages that need an archive URL added, this is pretty good can finish the rest manually. Good to see Danny's patch submitted. -- GreenC 15:37, 23 November 2019 (UTC)

Thank you. That was the recommended course of action indeed. (Such hijacks are one reason we need to get rid of all redundant publisher links as quickly as possible, replacing them with stable resources.) I see there are currently some links left to blackwell-synergy.com/toc/ and similar, but when you're done please note it at m:User:Praxidicae/DOI fix. Nemo 21:44, 25 November 2019 (UTC)

Hatnotes

I was wondering whether there are any bots which make sure hatnotes stay at the top of a page, in case someone adds another template at the top of hatnotes, like a message box.--CaiusSPQR (talk) 21:09, 1 December 2019 (UTC)

Related RfC

Hi. I have opened an RfC at Wikipedia talk:New pages patrol/Reviewers/Redirect autopatrol#RfC on autopatrolling redirects that relates to bots. Watchers may be interested. Thanks, --DannyS712 (talk) 01:38, 2 December 2019 (UTC)

Error: 503 Service Unavailable

Greetings, Yesterday morning I pinged the enwp10 bot operators here, and with no response am wondering if anyone at Noticeboard is able to restart enwp10 bot at tools.wmflabs.org? Or is this a bigger problem than just the bot? Regards, JoeHebda (talk) 14:49, 13 December 2019 (UTC)

@JoeHebda: there is nothing we can do about this on-wiki. You can try also reporting it on phabricator, similar to the ticket: phab:T207877. — xaosflux Talk 15:37, 13 December 2019 (UTC)
I got a stale file handle error on tool forge yesterday and had to restart PearBOT and last time that happened AnomieBOT had to restart some tasks as well. Perhaps it's related? ‑‑Trialpears (talk) 15:42, 13 December 2019 (UTC)
@Trialpears: there certainly could have been some hiccup there, but like you noticed the fixes have to be done by the tool owners the only thing we can do is ask them to look at their services. — xaosflux Talk 15:51, 13 December 2019 (UTC)
Hello, enwp10 owner here. I've restarted the web tool and it seems to be working now. Thanks, and sorry for the delay. audiodude (talk) 20:52, 14 December 2019 (UTC)

New Pywikibot release 3.0.20200111

(Pywikibot) A new pywikibot release 3.0.20200111 was deployed as gerrit Tag and at pypi. It also was marked as „stable“ release and with a newly introduced „python2“ tag. The PAWS web shell depends on this „stable“ tag. The „python2“ tag indicates a Python 2 compatible stable version and should be used by Python 2 users.

Among others, the changes include:

All changes are visible in the history file, e.g. here

Best  @xqt 12:41, 17 January 2020 (UTC)

Pywikibot will be dropping Python 2.7 support

I wasn't aware that toolforge will be dropping support for Python 2.7 soonish (see phab:T213287), and only found out when WugBot choked recently. If you run a bot that relies on Python 2.7 now would be a good time to start migrating, and depending on how active some maintainers are, we may see some bots breaking in the future. Wug·a·po·des 22:37, 14 January 2020 (UTC)

Or install in user space (if Python can do that). -- GreenC 22:53, 14 January 2020 (UTC)
(Section header updated) T213287 is not about Toolforge dropping Python 2.7 support; it is about Pywikibot dropping it. Python 2 is no longer supported, and the final 2.7 release is scheduled for April,[11] so Pywikibot is following suit. I imagine that Toolforge will still have Python 2.7 installed until the OS is upgraded to a version without it. — JJMC89(T·C) 04:39, 15 January 2020 (UTC)
You imagine correctly, and according to wikitech:Help:Toolforge/Python the date is 2022, not so far away in the future. A temporary fix before then for PWB/Py2 bots would be to install the last Py2-supporting version of PWB in a local virtualenv. (On a side note, if you are a transition-lazy bot-maintainer, have a look at 2to3, the automatic Py2 to Py3 translator - it probably works way better than you expect.) TigraanClick here to contact me 12:40, 17 January 2020 (UTC)
After collecting usage statistics for pwb in phab:T242157 I do not expect a hasty change dropping Python 2. But probably new features will be Py3 only or additional libraries (like enum34) will be required to keep Py2 working with pywikibot framework. Anyway bot users are asked to migrate to Python 3, precisely 3.5+. There is a phab task for any help with py2to3 migration at phab:T242120; don't hesitate to ask for any support there. Finally there is a gerrit "python2" tag which will always work with Python 2; but further development will stop at some future point for it.  @xqt 13:01, 17 January 2020 (UTC)
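
For bot operators who want their scripts to fail early with a clear message rather than choke partway through a run, a start-up check against the 3.5+ requirement mentioned above might look like the sketch below. The function name `python_supported` and the `MINIMUM` constant are illustrative only and are not part of Pywikibot.

```python
import sys

# Minimum interpreter version named in the thread (Python 3.5+).
MINIMUM = (3, 5)

def python_supported(version_info=None, minimum=MINIMUM):
    """Return True if the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info  # default: the running interpreter
    return tuple(version_info[:2]) >= minimum
```

A script could call `python_supported()` first and exit with a pointer to the migration task if it returns False.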

Bot question

Is there a bot (or a script) that can be set up to remove image files from section headings? Images shouldn't really be added to article section headings per MOS:HEAD and MOS:ACCIM, and thanks to JJMC89 at User talk:JJMC89#Images in headings, I've now found a fairly fast way to search for such images. However, there seem to be quite a lot of articles (about a thousand) where this is a problem, including some which have quite a lot of sections with an image added to pretty much each section heading like this. The basic syntax for these files begins with either [[Image: or [[File:, but some articles may have Wikilinks in their headings as well so just searching for ==[[ might yield some false positives. I'm also not clear on how a space between the heading syntax and the file syntax might affect a bot's search results (e.g. ==[[File: and == [[File:). Manually cleaning these up is a bit time consuming, but it can be done; I'm just wondering if there might be a faster way. I'm not sure whether HEAD and ACCIM would also apply to article talk pages; the rationale seems to be applicable, but talk pages don't really need to be article quality in terms of the MOS, etc.

In a similar vein to the above, I'm wondering if there's also a way for a bot/script to look for citations added to article section headings. Citations, however, would usually come after the heading text itself, so maybe searching for </ref>== could help track them down. Citations might require a human editor to follow up to see if there's a way for the citation to be used in the article. Perhaps having a bot track them down and add them to a category page would be better than having a bot remove them outright. -- Marchjuly (talk) 05:54, 26 November 2019 (UTC)

As far as I know we have no scripts or bots for either thing. A dedicated bot might be desirable, although you'd want to make sure there are no justified exceptions beforehand - maybe ask some of the editors who added such links or files? Jo-Jo Eumerus (talk) 06:59, 26 November 2019 (UTC)
I've encountered citations in article headings, and usually move them to an introductory sentence in the section. But I have encountered a few, especially for lists, where this would be exceptionally awkward. Not impossible, awkward. DGG ( talk ) 18:05, 24 December 2019 (UTC)
re cites, something like Source: {{cite web...}} in the first or last line of the section is easy to automate. Regarding exceptions, if 95% should be converted, the other 5% should not stop the work. There can be flags like {{cbignore}} to tell bots not to convert those exceptions. If bot ops are required to manually check each one, it will never get done, and the damage is greater when 95% never get converted. I have no opinion about images in headings, though I think your example looks pretty cool and wouldn't be in a rush to change it because of a generic MOS guideline. Cites are more clear cut. But that is only IMO. -- GreenC 23:06, 14 January 2020 (UTC)

I've got no opinion about whether this is desirable or not, but if it is done make sure to ignore links to files, i.e. [[:Image: or [[:File:. Thryduulf (talk) 20:49, 1 February 2020 (UTC)
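The search described in this thread can be sketched with a couple of regular expressions. This is only an illustration, not an existing bot or script: the function name `scan_headings` and the patterns are assumptions made up for the example. It tolerates optional spaces after the heading markup, ignores plain wikilinks, skips colon-prefixed links such as [[:File:, and reports citations separately so that a human can decide what to do with them.

```python
import re

# A section heading line: 2-6 equals signs, the title, then the same run of equals signs.
HEADING_RE = re.compile(r"^(={2,6})\s*(.*?)\s*\1\s*$")
# An embedded file. "[[:File:" (leading colon) is only a link and must not match.
EMBEDDED_FILE_RE = re.compile(r"\[\[\s*(?:File|Image)\s*:", re.IGNORECASE)
# A citation inside the heading: <ref>...</ref> or a self-closing <ref name=... />.
REF_RE = re.compile(r"<ref[\s>/]", re.IGNORECASE)


def scan_headings(wikitext):
    """Return (line_number, problem, heading_line) tuples for suspect headings."""
    findings = []
    for number, line in enumerate(wikitext.splitlines(), start=1):
        match = HEADING_RE.match(line)
        if not match:
            continue  # not a heading, so images and refs here are fine
        title = match.group(2)
        if EMBEDDED_FILE_RE.search(title):
            findings.append((number, "image", line))
        if REF_RE.search(title):
            findings.append((number, "citation", line))
    return findings
```

Reporting rather than editing keeps the exceptions mentioned above (awkward list headings, pages flagged to be left alone) in human hands; a removal step could be layered on top once there is agreement on what to do with each kind of finding.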

ListeriaBot behaving poorly

ListeriaBot (talk · contribs)

I did leave a message on the operator's page User talk:Magnus Manske#zzzother, but feel that the bot may need disabled until it's fixed instead. Jerod Lycett (talk) 19:23, 1 February 2020 (UTC)

  • I've blocked the bot, as I see the issue and couldn't find out a way to fix it otherwise. Jo-Jo Eumerus (talk) 19:38, 1 February 2020 (UTC)
    @Jo-Jo Eumerus: was this issue occurring on multiple pages? If not, a partial block of the one page could suffice while waiting for the operator. — xaosflux Talk 19:40, 1 February 2020 (UTC)
    (Non-botman comment) In the spirit of an fyi, you may be waiting sometime; the botop hasn't edited for ~5 months. ——SN54129 19:47, 1 February 2020 (UTC)
    @Serial Number 54129: they have globally, just not locally - hopefully will respond to pings and talk notice. — xaosflux Talk 19:56, 1 February 2020 (UTC)
    Indeed, but what is it they say about hope not paying bills...? :) ——SN54129 19:59, 1 February 2020 (UTC)
    Replying to xaos' comment, while I would normally agree that a partial block is what's necessary here since it's (currently) only one known page, I think we should look at that page first and see what might be causing the issue; if it's just something on the page, then remove it and make sure the botop knows. If it's a glitch with the bot itself, then it could pop up anywhere. Primefac (talk) 20:05, 1 February 2020 (UTC)
    I suspect it might be something at Wikidata, but I couldn't tell where to look from the page code. Regarding using partial blocks, they are a new concept and I am a little wary of using them in lieu of a regular block when I don't know the full extent of the problem. Jo-Jo Eumerus (talk) 20:43, 1 February 2020 (UTC)
    Agree with you there, Primefac. Prior to pblocks, I would have tried to just {{nobots}} the page, but I saw that the bot doesn't respect nobots. From much lower, it looks like this was just a case of the bot working out of scope because another editor confused it. If that becomes a regular issue, having the bot improve its input validation would be a good fix bot-side. — xaosflux Talk 00:25, 2 February 2020 (UTC)
    @Xaosflux: I discovered this while doing WP:WCW work, so I don't know. That's why I originally brought it up to a BAG member, as I would not know where to begin the investigation other than just going through contribs and making a judgement. There is a supposed non-consensus of its use in mainspace for this purpose though: {{Wikidata list}}. Jerod Lycett (talk) 21:02, 1 February 2020 (UTC)
  • I was under the impression that ListeriaBot was allowed only to edit outside of mainspace. Did I miss something? --Izno (talk) 21:10, 1 February 2020 (UTC)
    I believe you have the correct impression; the page in question was only created two weeks ago (14 Jan) by TiagoLubiana. After removal of the bot call by Diannaa, Tiago re-added the code, but as near as I can tell the bot has never updated the page properly. Primefac (talk) 21:50, 1 February 2020 (UTC)
    It is indeed not supposed to, but the only enforcement of this is an error message from {{Wikidata list}} (There is no consensus to use Template:Wikidata list in articles.) It seems to me that a more sensible option would be implementing a partial block on mainspace edits. FYI, List of OBO Foundry ontologies is the only current use in mainspace, and previous ones have been removed semi-regularly, mainly by UnitedStatesian. ‑‑Trialpears (talk) 21:50, 1 February 2020 (UTC)
  • Okay, so apparently it's a copy/paste issue from the template documentation. That being said, {{Wikidata list}} has no consensus to be used in the article space so I have removed it from the article and will be shortly removing the ability for it to be used in the mainspace altogether. As this seems to be the major issue, I've also unblocked the bot. Primefac (talk) 21:54, 1 February 2020 (UTC)
  • Sorry about the trouble this caused. I saw the "no consensus" warning, but I failed to appreciate that it meant the template should not be used, at least not yet. Maybe a different wording in the warning could make it clearer for less experienced editors and avoid these mishaps. TiagoLubiana (talk) 19:22, 3 February 2020 (UTC)
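The bot-side input validation suggested in this thread could look roughly like the following. It is a simplified sketch and not ListeriaBot's actual code: the function names and the `BLOCKED_NAMESPACES` constant are hypothetical, the {{bots}}/{{nobots}} handling covers only the common allow/deny forms, and the only assumption about namespaces is that 0 is the article namespace.

```python
import re

# Matches {{nobots}} and {{bots|...}}; the parameter part is optional.
BOTS_TEMPLATE_RE = re.compile(
    r"\{\{\s*(nobots|bots)\s*(?:\|([^{}]*))?\}\}", re.IGNORECASE
)
# Hypothetical policy for this sketch: namespace 0 (articles) is off-limits.
BLOCKED_NAMESPACES = frozenset({0})


def bot_may_edit(wikitext, bot_name):
    """Simplified exclusion check: False when the page opts out for this bot."""
    bot = bot_name.strip().lower()
    for match in BOTS_TEMPLATE_RE.finditer(wikitext):
        if match.group(1).lower() == "nobots":
            return False
        for param in (match.group(2) or "").split("|"):
            key, _, value = param.partition("=")
            key = key.strip().lower()
            names = {name.strip().lower() for name in value.split(",")}
            if key == "deny" and (bot in names or "all" in names):
                return False
            if key == "allow" and not (bot in names or "all" in names):
                return False
    return True


def may_process(namespace, wikitext, bot_name):
    """Combine the namespace guard with the page-level opt-out check."""
    if namespace in BLOCKED_NAMESPACES:
        return False
    return bot_may_edit(wikitext, bot_name)
```

A check like `may_process(0, text, "ListeriaBot")` returning False before any edit is attempted would have the same effect as a partial block on mainspace, but enforced by the bot itself rather than by administrators.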

New Pywikibot release 3.0.20200306

(Pywikibot) A new pywikibot release 3.0.20200306 was deployed as gerrit Tag and at pypi. It was also marked as the „stable“ release and with the „python2“ tag. The PAWS web shell depends on this „stable“ tag. The „python2“ tag indicates a Python 2 compatible stable version and should be used by Python 2 users.

Among others, the changes include:

  • The Site method media_wikimessages() may return MediaWiki messages for foreign language codes.
  • For ISBN related tasks the stdnum package is required. (Task 132919, Task 144288, Task 241141)
  • The modules weblib and botirc were removed (Task 85001, Task 212632), also some outdated methods.

The following code cleanup changes are announced for the next release:

  • Test Site should be invoked by Site('test', 'wikipedia') instead of the deprecated Site('test', 'test').
  • MediaWiki versions prior to (LTS) 1.19 will no longer be supported (Task 245350).
  • The submodule „compat“, which was introduced to facilitate Pywikibot „compat“ to „core“ migration, will be deleted.
  • For Python 2 the package ipaddress is mandatory (Task 243171); the submodule tools.ip will be deleted.
  • Python 2.6 supporting backports.py will be deleted. (Task 244664)

All changes are visible in the history file, e.g. here

Best  @xqt 12:41, 17 January 2020 (UTC)

2015 thesis on Bots and Wikipedia

This morning, I stumbled upon

  • Clément, Maxime (2015). Collaboration et automatisation dans le transfert deconnaissances: Perception des agents logiciels parles contributeurs de Wikipédia (PDF) (M.Sc. thesis) (in French). Québec, Canada: Université Laval.

It's a truly fascinating read if you're into bot history and social norms related to bots on Wikipedia. It mostly covers 2009–2014ish, and is in French, but the appendix also contains a copy of an English summary/prior research published in Computers in Human Behavior if you don't speak French. Those who have access to that journal can find it directly at

They get a couple of minor details wrong (for instance, they claim I'm an admin), but by far and large it's on the money. Headbomb {t · c · p · b} 16:11, 24 February 2020 (UTC)

Limits on custom signatures

Please see mw:New requirements for user signatures. One of the goals is to make it easier for bots/scripts/tools to recognize custom signatures (e.g., by requiring that they all contain a link). Please share information or examples over there. Whatamidoing (WMF) (talk) 18:16, 4 March 2020 (UTC)

Commented, but holy hell does that VE hot garbage make it impossible to make any sort of sensible comment or edit them after the fact. Headbomb {t · c · p · b} 18:40, 4 March 2020 (UTC)
Hey, don't blame VE for that. That's pure, unmitigated LiquidThreads Flow Structured Discussions --AntiCompositeNumber (talk) 19:14, 4 March 2020 (UTC)

New Pywikibot release 3.0.20200306

(Pywikibot) A new pywikibot release 3.0.20200306 was deployed as gerrit Tag and at pypi. It was also marked as the „stable“ release and with the „python2“ tag. The PAWS web shell depends on this „stable“ tag. The „python2“ tag indicates a Python 2 compatible stable version and should be used by Python 2 users.

Among others, the changes include:

  • The Site method media_wikimessages() may return MediaWiki messages for foreign language codes.
  • For ISBN related tasks the stdnum package is required. (Task 132919, Task 144288, Task 241141)
  • The modules weblib and botirc were removed (Task 85001, Task 212632), also some outdated methods.

The following code cleanup changes are announced for the next release:

  • Test Site should be invoked by Site('test', 'wikipedia') instead of the deprecated Site('test', 'test').
  • MediaWiki versions prior to (LTS) 1.19 will no longer be supported (Task 245350).
  • The submodule „compat“, which was introduced to facilitate Pywikibot „compat“ to „core“ migration, will be deleted.
  • For Python 2 the package ipaddress is mandatory (Task 243171); the submodule tools.ip will be deleted.
  • Python 2.6 supporting backports.py will be deleted. (Task 244664)

All changes are visible in the history file, e.g. here

Best  @xqt 07:33, 6 March 2020 (UTC)

Discussion at Wikipedia talk:Bot policy#Bot expirations?

  You are invited to join the discussion at Wikipedia talk:Bot policy#Bot expirations?. Sdkb (talk) 08:25, 10 March 2020 (UTC)

Hyperactive bots

I've said my piece at User talk:Mifter#This is ridiculous. Narky Blert (talk) 23:23, 31 March 2018 (UTC)

Question regarding wikiproject open tasks bots

On the tea house I asked if there was a bot that could organize a wikiproject's backlog into an open tasks page. They directed me here. Can you help me? — Preceding unsigned comment added by Kaiser Kitkat (talkcontribs) 05:15, 17 December 2019 (UTC)

I don't know of any such bot, but perhaps they can tell you at WP:BOTREQ? Jo-Jo Eumerus (talk) 09:42, 17 December 2019 (UTC)
@Kaiser Kitkat: Try this external tool: find your WikiProject and then link to that page. --Izno (talk) 18:01, 17 December 2019 (UTC)

New Pywikibot release 3.0.20200326

(Pywikibot) A new pywikibot release 3.0.20200326 was deployed as gerrit Tag and at pypi. It was also marked as the „stable“ release and with the „python2“ tag. The PAWS web shell depends on this „stable“ tag. The „python2“ tag indicates a Python 2 compatible stable version and should be used by Python 2 users.

Among others, the changes include:


The following code cleanup changes are announced for one of the next releases:

  • Test Site should be invoked by Site('test', 'wikipedia') instead of the deprecated Site('test', 'test').
  • MediaWiki versions prior to (LTS) 1.19 will no longer be supported (Task 245350).
  • For Python 2 the package ipaddress is mandatory (Task 243171); the submodule tools.ip will be deleted.
  • Featured articles interwiki link related functions will be desupported.
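For scripts that relied on the tools.ip submodule, the ipaddress package named in the list above offers a direct replacement (it is part of the standard library on Python 3). The helper below is a minimal sketch with a hypothetical name, `is_ip`, and is not Pywikibot's own implementation; it shows one way to test whether a string, such as the username of an unregistered editor, is an IPv4 or IPv6 address.

```python
import ipaddress


def is_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True
```

On Python 2 the backported package expects unicode strings, so callers there would need to pass `u"192.0.2.1"` rather than a byte string.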

All changes are visible in the history file, e.g. here

Best  @xqt 12:04, 27 March 2020 (UTC)

Re-examination of ListeriaBot

The bot code has been updated. I'll leave the bottom section open for further comment but I think we're pretty much wrapped up here. Primefac (talk) 22:59, 18 April 2020 (UTC)
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

I think that this bot's BRFA was defective and that this deficiency has been shown repeatedly. I first encountered the bot when it was used by an editor to create a bunch of list articles in mainspace (see entries starting with List of museums). This bot did not seem to be approved to operate in mainspace, which seems to have since been confirmed in this BOTN discussion earlier this year. Today there is evidence of repeated problems with the bot and non-free images. From today: Bot operator's talk page and AN. Earlier discussions: User talk:ListeriaBot/Archive 1#Adding non-free images, Wikipedia:Village pump (technical)/Archive 154#ListeriaBot adding non-free images to Wikipedia namespace page, Wikipedia:Village pump (technical)/Archive 158#Listeria bot and non-free images, User talk:Magnus Manske/Archive 6#Non-free images being added by Lysteria bot and Wikipedia:Village pump (technical)/Archive 159#Lysteria bot and shadowing. This bot, even during the approval process, was operated outside of policies, and the operator does not seem to be responsive when concerns are raised. I ask for two actions:

  1. The bot be partially blocked from article space so that it may not edit there
  2. Changes be made to the bot, either by the current operator or by a new operator, to ensure that future non-free image problems do not occur and to the extent that they do that changes are made in a responsive manner

I recognize that this bot is important to the running of many behind the scenes tasks and do not want to disrupt those; it's only because of that that I am not asking for the bot's approval to be revoked until action 2 can be completed. However, I believe that this bot has, since before its approval, been operating outside of the bot policy and that this continues to the present time and in disruptive ways. Best, Barkeep49 (talk) 14:34, 11 April 2020 (UTC)

Partial block was floated last time; this time I'll implement it. Article-space only, but at the moment that seems to be the main concern. Should stop all the bickering at {{Wikidata list}} as well. Primefac (talk) 15:02, 11 April 2020 (UTC)
Thanks Primefac, but I do wish to note that today's issue around non-free images was caused in userspace. Hence action request 2. Best, Barkeep49 (talk) 15:06, 11 April 2020 (UTC)
Fair enough, though I'd like to hear other input before blocking wholesale again (unless you think only allowing in WP-space until this is sorted isn't too unreasonable).
Actually, that could still be an issue in WP-space. Do we know if things are properly set up on places like WiR? Primefac (talk) 15:12, 11 April 2020 (UTC)
  • If there are issues with the bot, a complete block makes the most sense until we can ascertain what needs to be fixed. Note I don’t think it’d be considered wheel warring because a consensus is emerging that it was a bad unblock. TonyBallioni (talk) 15:55, 11 April 2020 (UTC)
  • As approved, this bot should operate only within a handful of talk pages. Editing outside of that space without approval upsets the careful balance of community approval and oversight required for bots, a balance that was struck after a great deal of bot-related disruptions in the past. Good block, and there would be no issue extending this to a full block or to all of the namespaces other than user/user talk. Best, Kevin (alt of L235 · t · c) 16:19, 11 April 2020 (UTC)
    Do you mean namespaces other than WP/WPT? Seems like WiR is the main user of this bot's functionality. Primefac (talk) 16:41, 11 April 2020 (UTC)
  • Bot is working just fine in many different wikis including this one. If people make a list that includes images, the bot will only use images from Commons. The English Wikipedia shouldn't have unfree images with the same name, and if it happens then an easy fix is to rename the local image here (just append "(non-free)" to the name). I don't expect any bot code will be updated just for this edge case. Probably best if someone just publishes a collision report here for images that have the same name here and on Commons where the local one is not free. Bonus points for including whether the image is used on a Wikidata item. People who care a lot about non-free images can keep an eye on this report and take care of any collisions. Multichill (talk) 16:27, 11 April 2020 (UTC)
    • @Multichill: Speaking of the bot being used on other wikis, it hasn't been approved for global use and its use on small wikis has thus drowned out the recent changes queue (which is limited to the last 1000 edits) for patrollers trying to keep those wikis clear of spam and vandalism. And saying that you don't expect any bot code will be updated just for this edge case seems pretty backwards; bots ultimately exist to relieve Wikipedians of doing tasks that we'd otherwise have to do, not create greater problems for us. I completely agree that in a perfect world we'd have eliminated all the enwiki/commons conflicts long, long ago, but unless you're volunteering, I think it'd be a good start to comply with the copyright laws and copyright policy. Kevin (alt of L235 · t · c) 16:38, 11 April 2020 (UTC)
      And I am sure you know that Multichill was doing this single-handedly for a long time and stopped doing this in I believe 2012 or 2013 when he was told his work is not appreciated, and nobody in the community stood up for him.--Ymblanter (talk) 17:44, 11 April 2020 (UTC)
      @Ymblanter: I'm sorry, I didn't know that. I was unnecessarily cold in my message and I regret that, Multichill. My core point is that dealing with these cases is the responsibility of the bot operator whose bot is adding non-free images in contravention of policy. Best, Kevin (alt of L235 · t · c) 17:48, 11 April 2020 (UTC)
  • I concur with Multichill. I might even go one step further. Force all nonfree images to have a tag to that effect included in the filename. But that would need to be discussed elsewhere. Anyway the suggestion of Multichill would solve the Listeriabot problem from a practical point of view. Agathoclea (talk) 16:45, 11 April 2020 (UTC)
    • As Kevin says, bots should be reducing the burden on other editors, not increasing it. It would be better to update the bot once to not violate policy on the use of non-free images, rather than requiring editors to continuously watch for images with the same name but different licensing. The bot doesn't care about checking one extra thing for the images it uses, it takes a few CPU cycles. If the bot isn't able to operate without wasting volunteer time to make sure it doesn't violate copyright, then the bot shouldn't be operating. ST47 (talk) 16:46, 11 April 2020 (UTC)
  • The bot is used by a number of users and project pages. This alone is enough to prove that there is wide consensus of its usefulness. Nemo 20:59, 11 April 2020 (UTC)
  • I believe completely blocking ListeriaBot should be a last resort. It is used quite a lot and people clearly find it useful. I recognize that it probably will be difficult to prevent the bot from adding non-free images since the botop is inactive, but there are still ways to remedy the issue. My first idea would be making JJMC89 bot add {{bots|deny=ListeriaBot}} to pages with {{Wikidata list}} if it has to remove non-free content, and logging it somewhere. If that was done, ListeriaBot could continue as usual while avoiding copyright concerns. Pinging JJMC89. ‑‑Trialpears (talk) 21:19, 11 April 2020 (UTC)
  • Unless or until the bot is configured to comply with copyright laws of the United States and our local policies, it should not be approved to run. That is the bare minimum we should expect of a responsible bot operator. If the operator cannot or will not do so, the bot should have its flag revoked and blocked from all editing until the threat to the project is resolved. Wug·a·po·des 21:22, 11 April 2020 (UTC)
  • Unblock and restore approval. The bot performs a very useful function (building Wikipedia-space lists of articles linked on other Wikipedias but missing from this one, in aid of people looking for articles in need of creation) and the issue causing the block is minor, incidental, and caused by other problems unrelated to the bot (we should not have non-free images that shadow free ones and it is unsurprising that when we do it will confuse both the bot and human editors). If possible, the bot should be made to recognize this situation (I am not sure how difficult this task is and how much extra load it will put on the bot — currently it gets its information from a single SparQL query and this would likely mean that it would have to perform hundreds of additional queries per page to check the status of images) but I see that as a long-range enhancement to aim for and not a reason for blocking. Any images identified as causing problems for the bot can be moved to better names, fixing the problem without any need to resort to a block. —David Eppstein (talk) 23:05, 11 April 2020 (UTC)
  • It's impossible to write a bot whose edits are guaranteed to comply with copyright laws, just as no editor can make that guarantee for their edits. If you add a free image from Commons to an article, you have no way of guaranteeing that somebody will not then upload a different non-free image locally with the same filename. The image in the article now breaches fair use, so who is responsible for the breach? Is it the editor who originally added the free image from Commons? or the editor who uploaded the non-free image locally? You solve those problems, as well as the one the bot experiences, by making sure that non-free images held locally have a designation in the filename – e.g. (NF) – . That would need a greater consensus than we can create here, but there would be huge incidental benefits for re-users of our content who would immediately know what part of our content isn't available under CC-BY-SA. --RexxS (talk) 23:21, 11 April 2020 (UTC)
    RexxS, as it happens that is discussed at Wikipedia talk:Non-free content/Archive 70#Requiring non-free content to indicate that in their filenames. ‑‑Trialpears (talk) 23:29, 11 April 2020 (UTC)
    RexxS, “breaches fair use” You mean breaches our non-free content criteria. Fair use is a copyright violation defense, not a rigid set of laws. —TheDJ (talkcontribs) 08:50, 12 April 2020 (UTC)
    @RexxS: As explained in detail at the other discussion renaming the files would not accurately inform people of the copyright status of any image (not all free images are cc-by-sa, and even the ones that are are not all the same version; non-free files may be free for a re-user depending on their commercial status, geographical location and whether they need to allow for the possibility of derivatives). It would however actively mislead reusers into thinking the filename was all they need to know. Thryduulf (talk) 10:06, 12 April 2020 (UTC)
    @TheDJ: I'm pretty certain everyone will be aware of what would be breached despite my shorthand.
    @Thryduulf: I can see your point, but I'm not sure I agree with it. To all intents and purposes, a file hosted on Commons is usable anywhere by third parties given that they meet the license conditions. The minutiae of the licence do not change it from a free-to-use file into a non-free-to-use file. A recognisable token in the filename of a non-free-to-use file hosted on enwiki would at least indicate to which of two groups it belongs, and a third party would be able to differentiate immediately which content is not available from content that may be available upon meeting the conditions of its licence. --RexxS (talk) 16:48, 12 April 2020 (UTC)
  • I'm still amazed that the whole problem here is that En.wp and Commons users/sysops cannot get together to delete or move either one of the colliding files, which would clearly be the least disruptive, most effective and easiest solution to a relatively rare problem, and something that should happen whether the file is in use or not, honestly. But no, if it ain't my boathouse I'll just block the bot. —TheDJ (talk • contribs) 09:10, 12 April 2020 (UTC)
    @TheDJ: Actually, that is the most common solution to a not-so-rare problem. But sometimes a deletion (or two deletions) are a better move and in that case the problem lingers for a while. More importantly, someone has to notice the shadowing situation, first, and GreenC bot doesn't do so instantaneously ... in a way it's a race condition between Listeriabot and GreenC bot (+the admins who process Category:Wikipedia files that shadow a file on Wikimedia Commons) Jo-Jo Eumerus (talk) 09:21, 12 April 2020 (UTC)
  • A bot should never be edit warring with another bot. When one of those bots is enforcing enwiki policy, and the other is violating it, it is clear where the problem is. This bot should remain blocked until the maintainer corrects the issue. Bot policy requires that maintainers be responsive to issues that may arise. I don't fault the operator for allowing this to happen the first time, as there are indeed a lot of edge cases in our little encyclopedia. However, once it has been discovered, the problem needs to be fixed, or the BAG should withdraw approval from this bot. ST47 (talk) 15:49, 12 April 2020 (UTC)
    @ST47: In this case, it would actually be better if the Listeria edits were allowed to stand, and generic NFC removal bots were stood down on Listeria pages, because the presence of a NFC file on a Listeria page is a good, very specific, really easy to detect tell-tale for a filename with the file-shadowing problem, that is otherwise rather hard to detect, and which is best dealt with by specific action to rename one or more of the filenames and resolve the collision, which surfacing the problem enables. Here, removing the tell-tale is not helpful. Jheald (talk) 11:46, 13 April 2020 (UTC)
    @Jheald: How do you anticipate the issue would get flagged if NFC bots are not involved? Nikkimaria (talk) 00:15, 14 April 2020 (UTC)
    @Nikkimaria: The issue that causes the problem here is very specific. It is not the usual generic NFC problem of somebody having purposely added an NFC image to an article or list where it shouldn't be, that the bots typically deal with. Instead, ListeriaBot is trying to show a completely legitimate Commons image, but the issue is that there is a local NFC image which unfortunately has the same name. GreenC bot is already on the case, adding such images to Category:Wikipedia files that shadow a file on Wikimedia Commons, for humans to decide the best new filenames to resolve the issue. (Which then ensures it cannot occur again for that file). But if this is not enough, it would be straightforward to run a specific SQL query to find images that are on Listeria pages and also in Category:All non-free media, which could also be used to flag the files into the category for them to get the attention they need. Jheald (talk) 14:32, 14 April 2020 (UTC)
  • I concur with User:Multichill, and I'm actually appalled by the attitude of this community towards Wikidata and its tools, since it goes against every and all principles we have about cooperation, mutuality and good faith. I've been silent this whole time, since Wikidata's very inception, so I beg you to forgive me this burst of annoyance. --Sannita - not just another it.wiki sysop 13:48, 14 April 2020 (UTC)
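Both remedies discussed above, the collision report and an extra check inside the bot, reduce to comparing normalized file names. The sketch below works on plain lists of titles; where those lists come from (a database query, a category listing) is left open, and all function names are hypothetical. The normalization is deliberately simplified: it drops a File: or Image: prefix, treats underscores as spaces, and capitalises the first letter.

```python
def normalize(title):
    """Drop a File:/Image: prefix, use spaces, and capitalise the first letter."""
    title = title.strip().replace("_", " ")
    prefix, sep, rest = title.partition(":")
    if sep and prefix.strip().lower() in ("file", "image"):
        title = rest.strip()
    return title[:1].upper() + title[1:]


def collision_report(local_nonfree, commons_files):
    """Names that exist both as a local non-free file and as a Commons file."""
    commons = {normalize(title) for title in commons_files}
    return sorted({normalize(title) for title in local_nonfree} & commons)


def usable_images(candidates, local_nonfree):
    """Keep only candidates that are not shadowed by a local non-free file."""
    blocked = {normalize(title) for title in local_nonfree}
    return [title for title in candidates if normalize(title) not in blocked]
```

`collision_report` corresponds to the list that humans would work through by renaming or deleting files, while `usable_images` corresponds to the bot skipping a shadowed image until the collision has been resolved.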

let us consider what Listeria lists stands for

The following discussion has been closed. Please do not modify it.

I am quite happy to have the bot flag of ListeriaBot considered but lets do it properly. When you really, really, really want to go this way. Let us consider what Listeria lists stands, its Wikipedia alternatives. Disambiguation, red links blue links and black links. Most importantly how we share in the sum of all knowledge and how English Wikipedia can play a vital role in it. Let's include images linked to people, the role Commons can play in this. How English Wikipedia can keep its non free images and inform on the images that it keeps in this way. Let this conversation not be about an edge case.

By all means discuss a bot flag for ListeriaBot but do present a serious alternative. Serious not in intentions but serious in that it will serve us in a way that is imho missing in what Wikipedia stands for in its dismissal of collaboration on multi project and multi language levels. Without a reasonable outcome branding us all as Wikipedia is mostly painful because of what we could stand for together. Thanks, GerardM (talk) 18:03, 11 April 2020 (UTC)

GerardM, I don't say this often, but you just wrote a lot of text without saying anything. What are you trying to say? Primefac (talk) 18:12, 11 April 2020 (UTC)
Primefac maybe you understand my blogpost better.. For me this episode is another reason why I do not want to be associated with Wikipedia. What is it with you people? Thanks, GerardM (talk) 08:44, 12 April 2020 (UTC)
@GerardM: Your blog post also doesn't say anything that is useful to this situation. Everybody agrees that the core job this bot does is useful, so simply repeating that it is useful and explaining why it is useful adds nothing of value. The issue is that the bot will not be unblocked unless and until it is reprogrammed so that it doesn't edit outside of what it has authorisation to do. This is exactly how every other bot that has bugs is treated: if the operator does not stop it, then it is blocked. Listeria bot is not special; it is being held to the same standards required of every bot that operates on the English Wikipedia. Thryduulf (talk) 12:15, 12 April 2020 (UTC)
@Thryduulf: at issue is that English Wikipedia has pictures that are not free. These pictures only show on English Wikipedia. What we can do is include a link to the Wikidata item on Commons and only show pictures marked that way. It has additional benefits because those pictures will be easier to find including in "other" languages like Russian, Kannada, Comanche. It says so in the blogpost.
Also do you not think that this is where English Wikipedia untouchables take a position where the penalty to our community is excessive. What are you guys thinking??
Also, when are we going to discuss and act on issues with quality on English Wikipedia. At least 4% of list entries in English Wikipedia are erroneous and the quality of maintenance by hand is substantially less than Listeria maintained lists. Together we will do a better job. Thanks, GerardM (talk) 12:31, 12 April 2020 (UTC)
How is any of that relevant to this discussion? It doesn't matter what else Listeria bot does, can or could do, it will not be unblocked unless and until it is reprogrammed so that it doesn't do anything it does not have community consensus to do. Thryduulf (talk) 12:37, 12 April 2020 (UTC)
How is it that you are only willing to consider a perceived wrong and not willing to consider the meat of the matter, quality? When you insist on branding ListeriaBot as ill behaved because of a corner case, a four percent improvement fixing the false friends in Wikipedia lists is quite substantial and should have your attention. Thanks, GerardM (talk) 13:10, 12 April 2020 (UTC)
Simply put, the harm done by one avoidable copyright violation outweighs all the other good things the bot does. Human editors that behave in the way this bot does (knowingly or recklessly introducing copyright problems, editing otherwise than in accordance with consensus, ignoring editing restrictions) are regularly blocked. The good they do elsewhere is not regarded as justifying the harm they cause. Listeriabot is not special and there is no reason to treat it differently than any other bot or editor would be treated. Thryduulf (talk) 14:23, 12 April 2020 (UTC)
Simply put, substantiate your claims. Your argument is about a corner case where a procedure exists to overcome issues. We are talking about issues at Commons, a project that is more stern in its maintenance of copyright than English Wikipedia is. On the other hand, an error rate of 4% of all English Wikipedia lists is substantial; there is no mitigation, and the case is well argued. In addition Magnus has demonstrated that Listeria lists are better maintained than the average manually maintained list. Now consensus is something to hide behind when arguments fail you. Such behaviour is not special and harms our cause. Is quality of English Wikipedia a consideration at all? Thanks, GerardM (talk) 14:47, 12 April 2020 (UTC)

Accidental usage of non-free images

The issue raised here seems to be centered on the accidental use of non-free images via the edits of this bot. I've posted a proposal at Wikipedia_talk:Non-free_content#Requiring_non-free_content_to_indicate_that_in_their_filenames that would resolve this without requiring changes to this bot. Please comment there! Thanks. Mike Peel (talk) 18:45, 11 April 2020 (UTC)

You do realize that until that RFC concludes you're essentially saying you're okay with situations like this, where two bots mindlessly edit war for no reason other than the fact that one bot is not "behaving" correctly? Primefac (talk) 20:33, 11 April 2020 (UTC)
@Primefac: I'm suggesting a broader solution that would fix this issue while simultaneously avoiding any similar situation arising in the future. I could implement it tomorrow if there's consensus to do so. Thanks. Mike Peel (talk) 20:43, 11 April 2020 (UTC)
Then by all means, please fix the bot so that it doesn't add non-free files to non-articles. Primefac (talk) 21:13, 11 April 2020 (UTC)
@Primefac: I'm suggesting fixing enwp so the bot wouldn't cause problems. Mike Peel (talk) 21:15, 11 April 2020 (UTC)
While I realize it's only been a few hours, there is currently no consensus to implement your plan. I am genuinely curious, why is there such reluctance to make this change? Primefac (talk) 21:18, 11 April 2020 (UTC)
It is not entirely obvious to me that this change is an easy one to make. How many additional server queries would be required to detect commons-images-shadowed-by-non-free-local-images, compared to the queries the bot already makes to do its work, how much extra load would be caused by the queries, and what information from the server is available for the bot to determine whether a local image that shadows a commons image is unfree? Have you actually done this analysis? Or do you just assume that because you can do it as a human by clicking on and using your human ability at natural languages to read a few pages that it will be equally easy for a bot? I don't actually know that it's difficult, but I don't know that it's easy, and I don't see convincing evidence that you do either. —David Eppstein (talk) 23:47, 11 April 2020 (UTC)
The bot would just need to check whether the image is in Category:All non-free media. This can easily be done using the categorymembers API, or if the bot runs on toolforge, using the database mirrors. ST47 (talk) 23:53, 11 April 2020 (UTC)
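A minimal sketch of the category check described above, written in Python purely for illustration (it is not taken from the bot's source). It uses the MediaWiki `action=query` module with `prop=categories` and the `clcategories` filter rather than `list=categorymembers`, so that several files can be checked in one request; the function names are hypothetical, and the network call itself is left out so that only request building and response parsing are shown.

```python
from typing import Dict, Iterable, Set

NON_FREE_CATEGORY = "Category:All non-free media"


def build_category_query(file_titles: Iterable[str]) -> Dict[str, str]:
    """Build API parameters asking which of the titles are in the non-free category."""
    return {
        "action": "query",
        "prop": "categories",
        "titles": "|".join(file_titles),       # titles are pipe-separated
        "clcategories": NON_FREE_CATEGORY,     # only report this one category
        "cllimit": "max",
        "format": "json",
        "formatversion": "2",                  # pages come back as a list
    }


def non_free_titles(response: dict) -> Set[str]:
    """Return the titles whose page entry lists the non-free category."""
    pages = response.get("query", {}).get("pages", [])
    return {page["title"] for page in pages if page.get("categories")}


# Example with a hand-written response in the formatversion=2 shape.
sample_response = {
    "query": {
        "pages": [
            {"title": "File:Free photo.jpg"},
            {
                "title": "File:Album cover.jpg",
                "categories": [{"ns": 14, "title": NON_FREE_CATEGORY}],
            },
        ]
    }
}
print(non_free_titles(sample_response))
```

The parameters would be sent to the wiki's `api.php` endpoint; a long list would need to be split into batches, since the API caps the number of titles per request.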
@David Eppstein and ST47: ST47 beat me to it. Glancing at the source, the bot makes ample use of SQL queries. A patch for this issue would be no more than 10 lines of code (perhaps I'll create a pull request...). --Mdaniels5757 (talk) 00:06, 12 April 2020 (UTC)
@Mdaniels5757: I suspect you may find it will take a bit more than that, and be rather more server-intensive than you seem to think, to deal with a problem of shadowed file-names that shouldn't exist anyway. Jheald (talk) 12:41, 12 April 2020 (UTC)
I don't think that the bot has to detect shadowed files; it needs just to detect whether the enwiki filepage is non-free and that can be done by checking for Category:All non-free media or the {{Non-free media}} template. It's not really reasonable to expect a bot (or even a human) to detect incorrectly licenced files on either project; I'd file these under GIGO and let editors take care of them as they come across them. Jo-Jo Eumerus (talk) 08:52, 13 April 2020 (UTC)
@Jo-Jo Eumerus: Sure. I don't disagree.
But (as I suggest in more detail in the section below) what will add to the complexity of the bot, and the load on the servers, is having to detect when it is adding files at all, and then having to run a SQL request to check each one of them -- and having to do so in a way that is specific to en-wiki distinct from any of the other 70 wikis the code is serving, fracturing what otherwise is relatively simple single unified code.
Moreover, again as argued below, the most relevant point I think is it may actually be beneficial that the bot is surfacing files with this shadowing issue, that (once the edit is made) can then be rather easily picked up by an SQL intersection of images in the non-free category and images on Listeria pages, so that the underlying filename problem can then be identified and fixed, rather than it continuing to fester under the surface. Jheald (talk) 09:18, 13 April 2020 (UTC)
But again, why does it need to display the image for that purpose? As repeated ad nauseam through all three discussions, the fix isn't some kind of lengthy recoding; it's literally the addition of a single colon so the relevant section of the reports generates [[:File:Filename.jpg]] instead of the current [[File:Filename.jpg]]. ‑ Iridescent 09:58, 13 April 2020 (UTC)
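The single-colon change described above can be illustrated with a short Python sketch. This is not the bot's own code; the function name and the regular expression are assumptions made for the example. It rewrites every embedded `[[File:...]]` (or the legacy `[[Image:...]]`) into a plain link `[[:File:...]]` and leaves existing links alone.

```python
import re

# Match "[[" only when it is immediately followed by a File:/Image: prefix.
# "[[:File:..." is not matched, because a colon follows the brackets there.
FILE_EMBED = re.compile(r"\[\[(?=(?:File|Image):)", re.IGNORECASE)


def linkify_file_embeds(wikitext: str) -> str:
    """Turn embedded files into links by inserting a leading colon."""
    return FILE_EMBED.sub("[[:", wikitext)


print(linkify_file_embeds("| [[File:Filename.jpg]] || [[Some article]]"))
```

Running the function twice gives the same result as running it once, so it would be safe to apply to a page that has already been converted.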
(ec) @Iridescent: As I understand it, that is not the fix that Jo-Jo was suggesting. He was suggesting making that change only for the files that are local non-free ones. Which is a much larger coding job, with the issues noted above.
What you suggest would remove display of all the images in all the Listeria lists - all 2500 of them on enwiki - to deal with a transient incidental issue that affects at most only a handful of images at a time out of all of those lists, and in talk-space not main space. It means for example, in a Listeria blue-link/red-link list of paintings by an artist, Listeria would no longer show what the paintings were, which can be hugely useful for the identification of paintings that may go by various different names, or for the identification of duplicates; it makes it hard to identify paintings that may have substandard images, or eg to prioritise article creation for paintings that have really good images. So I do think that the blanket turning off of all images on Listeria pages is a step to be avoided if we possibly can.
Also, it means that the mechanism described above, of being able to identify shadowed images by a simple SQL query through their being used on a Listeria page would fail, because they would no longer be being used on a Listeria page. Jheald (talk) 10:21, 13 April 2020 (UTC)
Mind you, we already have a bot that flags shadowed files (GreenC bot), so I wouldn't consider another bot doing the same as a large advantage. Jo-Jo Eumerus (talk) 10:17, 13 April 2020 (UTC)
If it's doing such a good job, then why do we have this problem? Jheald (talk) 11:36, 13 April 2020 (UTC)
Probably because ShadowsCommons situations were not really time sensitive matters until ListeriaBot began getting confused by them. I am also not sure how the latter helps "surfacing" the shadowed files. Also, I think that I and Iridescent are approaching different ends of Listeriabot in an attempt to resolve this issue (I am looking at the input, Iridescent is proposing a change to the output) Jo-Jo Eumerus (talk) 12:36, 13 April 2020 (UTC)
@Jo-Jo Eumerus: Thinking about this a bit more, from a strictly coding point of view, the cleanest approach (if there really is a problem here that needs to be dealt with) might be to let the existing code make its edit, wait a few seconds for the SQL tables to update, then run an extra script to make an SQL query to see whether any of the images now on the page were also in the non-free images category, and if so then edit the page to insert a colon to turn the displayed file into a link, and also make sure that the file was categorised in Category:Wikipedia files that shadow a file on Wikimedia Commons.
This would have the advantage of requiring only the most minimal changes to the existing main Listeria script, that runs across 71 wikis; and also making sure that the identified files were in the category for fixing. But it would come at the cost of an extra edit, and of the files still appearing where they shouldn't for a few seconds. Do you think such an approach would be (a) workable, and might be (b) acceptable? Jheald (talk) 13:59, 13 April 2020 (UTC)
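A rough Python sketch of the follow-up step proposed above, assuming the set of non-free file names on the page has already been obtained by some other means (for example the category check or SQL query mentioned earlier). Only the embeds whose names appear in that set are turned into links; all other images stay displayed. The names `linkify_non_free` and `normalise` are hypothetical, and the title normalisation is deliberately simplified.

```python
import re
from typing import Set

# Capture the namespace prefix and the file name of each embedded file.
EMBED = re.compile(r"\[\[(File|Image):([^|\]]+)", re.IGNORECASE)


def normalise(name: str) -> str:
    """Simplified title normalisation: underscores to spaces, first letter upper-case."""
    name = name.strip().replace("_", " ")
    return name[:1].upper() + name[1:]


def linkify_non_free(wikitext: str, non_free: Set[str]) -> str:
    """Insert a leading colon only for embeds whose file name is in `non_free`."""
    def replace(match: re.Match) -> str:
        prefix, name = match.group(1), match.group(2)
        if normalise(name) in non_free:
            return f"[[:{prefix}:{name}"
        return match.group(0)

    return EMBED.sub(replace, wikitext)


page = "[[File:Free.jpg|60px]] [[File:cover_art.jpg|60px]]"
print(linkify_non_free(page, {"Cover art.jpg"}))
```

As the comment above notes, a second pass like this costs an extra edit and leaves the image visible until that edit is saved.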

Status of bot

  • I've restored the full block based on consensus here and at AN that the bot was operating outside policies in multiple areas and that the initial block was good, and that the unblock did not coincide with our normal practices on bot unblocks. The only objection raised at AN was procedural (wait 24 hours), and since that thread has been closed and this one is dealing more with the technical issues, I felt that it was best to act on the community consensus while there was still an obvious link to the discussion. I feel that this fulfills the requirements of WP:WHEEL that discussion occur first and consensus be reached. If there is consensus that I have acted out of line and that I am misinterpreting policy, another administrator is free to reverse. TonyBallioni (talk) 21:31, 11 April 2020 (UTC)
  • The bot indeed does not have approval to violate our fair use policy and should remain blocked on enwiki until the copyright issues are solved. That it is deemed useful does not make copyright policy optional. Headbomb {t · c · p · b} 13:24, 12 April 2020 (UTC)

What happens to Wikidata updates?

Apologies if I've missed this, but what's going to happen with the Wikidata list updates from now on (apart from them not happening)? Thanks. Lugnuts Fire Walk with Me 07:27, 12 April 2020 (UTC)

As it is, they happen. Just not on English Wikipedia. Thanks, GerardM (talk) 08:39, 12 April 2020 (UTC)
Lugnuts, which Wikidata lists specifically, can you give examples ? There are many possible interpretations of your phrasing, and more exact answers require more exact questions. —TheDJ (talkcontribs) 08:46, 12 April 2020 (UTC)
@TheDJ:, ones such as WP:WIROLY. This would be updated every day or so, removing links that now have wiki articles. Lugnuts Fire Walk with Me 08:56, 12 April 2020 (UTC)
Lugnuts, the only effect is that updates won’t happen. And you cant make new lists either. —TheDJ (talkcontribs) 09:03, 12 April 2020 (UTC)
What happens is that the hundreds, if not thousands, of editors whose work is assisted by Listeriabot, and by whose consensus it has operated for years, get badly inconvenienced. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:00, 12 April 2020 (UTC)
Inconvenience in non-essential tasks is a small price to pay when the alternative is mass copyright violation and bot operator that cannot and/or will not follow basic bot policy. Thryduulf (talk) 09:58, 12 April 2020 (UTC)
Citation needed. Mass copyright violation, REALLY who are you fooling! Thanks, GerardM (talk) 11:27, 12 April 2020 (UTC)
@Thryduulf: There simply isn't mass copyright violation. That's bullshit, and really you are better than this. There are a handful of edge cases, that would be well-handled by renaming the images so they don't clash with the Commons names. (Something we ought to be doing and ought to have been doing anyway). Jheald (talk) 12:02, 12 April 2020 (UTC)
There is nothing stopping the bot as currently programmed from committing mass copyright violations, and given the bot operator does not see this as a problem with the bot, then I'm sorry but the encyclopaedia is better off without the bot. Thryduulf (talk) 12:07, 12 April 2020 (UTC)
@Thryduulf: The bot isn't committing "mass" copyright violations. I don't know if you've looked at the bot's contribution history and scrolled back through the last 7 days to get an idea of the number of projects and contributors using this bot to organise and present their workflows, but it's quite a number. So no, the encyclopedia is not "better off without this bot". The present block is a wildly disproportionate over-the-top response to deal with a tiny handful of edge cases that shouldn't exist anyway if en-wiki had been doing its job. Jheald (talk) 12:37, 12 April 2020 (UTC)
@Jheald: the correct number of copyright violations is zero. Any bot making greater than that many copyright violations is better off blocked regardless of what else it does - if it is important then the bot will be fixed or someone else will code a replacement bot that doesn't violate core policies. It is always the responsibility of a bot operator to ensure that it operates in accordance with policy and consensus, it is never the job of the English Wikipedia to change policies or practices to make allowances for a badly coded bot. Thryduulf (talk) 12:49, 12 April 2020 (UTC)
@Jheald: fixing the ping. Thryduulf (talk) 12:49, 12 April 2020 (UTC)
Others have dealt with your fatuous "mass copyright violation" claim. Who gets to decide which tasks are "non-essential"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:29, 12 April 2020 (UTC)
A task is essential if (a) the encyclopaedia will cease to function if it ceases, or (b) the encyclopaedia or its editors will suffer real-world harm if it ceases. So removing copyright violations is essential, adding them is not. That applies whether you regard repeatedly introducing multiple copyright violations to multiple pages for multiple years as "mass" or not. Thryduulf (talk) 15:50, 12 April 2020 (UTC)
I note that you didn't answer my question, but instead indicate that virtually nothing is essential, in your own reckoning, and by logical extension it is OK to inconvenience - to greatly inconvenience, in this case - anyone and everyone not working on that very limited set of tasks that you deem essential. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:58, 12 April 2020 (UTC)
What else could reasonably be deemed essential? Inconveniencing people, greatly or otherwise, is unfortunate but as Headbomb points out this does not make complying with policy optional. Thryduulf (talk) 18:40, 12 April 2020 (UTC)
I need to use Listeria to replace my manually edited page List of Catholic churches in Salvador, Bahia. I can't do this manually anymore, it's too difficult. How do I get started with Listeria? Thanks! Prburley (talk)
That would not have been permitted even before ListeriaBot was blocked. * Pppery * it has begun... 14:43, 14 April 2020 (UTC)
@Pppery: Why not? It's just too overwhelming to edit these lists manually, and information on historic heritage sites is a vital part of the WP mission--not to mention that they're being damaged or destroyed frequently. Prburley (talk)
@Prburley: Because ListeriaBot is not approved for that. Headbomb {t · c · p · b} 17:09, 14 April 2020 (UTC)
@Headbomb: What's the process for having it approved? The solution below will work, but I think Listeria is what I'm looking for. Prburley (talk)
@Prburley: See WP:BOTAPPROVAL. Headbomb {t · c · p · b} 17:40, 14 April 2020 (UTC)
@Prburley: It could use a bit of tidying-up, but as a basic Listeria list, if you put something like this on a page in your own user-space, Listeria would then update it to give you something like this. Having hand-checked it, you could then copy & paste it to a live page in article space (and similarly with updates down the line). As Pppery says, Listeria currently isn't allowed to edit or update article space on en-wiki directly (unlike pt-wiki where this week this Listeria list made "featured list" status), but as I understand it, there is no objection to using Listeria to create a list in your own user-space and then copying that to main-space, so long as you have personally hand-checked it, and it is appropriately referenced.
There are various tweaks that could be added to the pretty quick and rough example above -- for example the coordinates could be presented more attractively, a notes column could be added and populated by adding described by source (P1343) statements to the Wikidata entry for each church, some English labels or English sitelinks may be missing, we could probably do better for "location", etc; and there may be entries missing from the list because they don't have items, or aren't currently identified as churches; or don't currently have the right diocese (P708) information. But I hope it gives some idea at least of what is possible. And of course, having curated it for English wikipedia, the data is then also immediately available for anyone wanting to get Listeria to make a version of the page in Portuguese or any other language. Jheald (talk) 15:24, 14 April 2020 (UTC)
Also probably worth noting that the page above was generated on Wikidata. If it was generated on en-wiki, then blue-links would be to en-wiki articles, with a choice of red-links or plain text for items not matched to en-wiki. Jheald (talk) 15:36, 14 April 2020 (UTC)
Version with what Wikidata has at the moment for location (P276) now added [12]. Currently a bit sparse, but could easily be improved. Jheald (talk) 15:33, 14 April 2020 (UTC)
Thank you for your amazing suggestions! I really appreciate it. Prburley (talk)

Forking the bot

Everyone seems to agree that the bot should be updated to fix the issue, but that isn't possible since the operator is inactive and hasn't fixed the issue the other times it's been brought up. The solution then seems to be forking the bot and implementing a patch. Since the source is public on bitbucket it shouldn't be too hard. Anyone who would volunteer to take on the task? ‑‑Trialpears (talk) 10:01, 12 April 2020 (UTC)

@Trialpears: I'm not quite sure who the "everyone" is that you are referring to. I see a number of editors above saying that the more appropriate solution would be to deal with the handful of files with names that shadow Commons, that it would be worth fixing anyway. Jheald (talk) 12:48, 12 April 2020 (UTC)
@Jheald: The point repeatedly made and ignored is that we already do deal with those images, and that no other bot has any issues with the status quo. Thryduulf (talk) 12:51, 12 April 2020 (UTC)
@Thryduulf: If en-wiki already is dealing with these images with shadow names, then the block here is even more a gratuitous unnecessary overkill than was first apparent. If even the handful of problems that have caused this fuss are already getting resolved within a few days, then why block the bot? If the issue is already in hand, and gets resolved routinely, then why all this fuss? Jheald (talk) 13:59, 12 April 2020 (UTC)
Corollary - user is found adding copyrighted content to a page. They are reverted and warned. They do it again, and they are reverted. Repeat ad nauseam. Let's say every time they add the copyrighted content they are reverted within a few days.
After how many reverts and warnings would we block the user? 1? 3? 10? My personal experience says 3, and from what I've seen the bot has previously done this on a single page more than a dozen times.
Yes, the images the bot is trying to place are being removed. HOWEVER, the bot shouldn't be placing them in the first place. It shouldn't be editing in article space (which is a secondary/minor issue being brought up again in this thread). "Living" editors get blocked all the time for this sort of behaviour, regardless of how otherwise useful their edits may be. Thus, it only makes sense to block a bot that is performing in the same manner. Primefac (talk) 14:04, 12 April 2020 (UTC)
@Primefac: Oh I'm sorry, I thought when User:Thryduulf said "we already do deal with those images" he meant that something was actively being done to prevent the problem recurring, by renaming the badly named local images on en-wiki. That fixes the problem. The purpose of this bot is to show what the SPARQL query returns, producing a facility that hundreds of users (or thousands, if you include all wikis) are using. If you're just removing the images from the page, then you're not solving the problem. On the other hand, if you solve the actual problem, by renaming the image (which ought to be renamed anyway), rather than covering up the conflict, then the issue goes away for good, and no change is needed to the bot. Jheald (talk) 14:20, 12 April 2020 (UTC)
At this point I think we're talking past each other. Yes, the images can/should/are being renamed when they are found. I'm not saying that shouldn't happen, because it already happens. But the bot should not be adding them to pages anyway. Why does it have to be one or the other? Why can't it be both? Primefac (talk) 14:36, 12 April 2020 (UTC)
(edit conflict) @Jheald: The reason for the block has been explained several times. The copyright issue has been ongoing and ignored since at least 2017 - that's far, far, far longer than anyone has a right to expect to ignore a problem without sanction, and that ignores the other issues of the bot operator not responding to multiple other complaints about editing beyond its authorisation. There are two ways that shadow images can happen, the first is for a new file to be uploaded to enwp but this does not happen as only sysops can do that and they get a warning about it. The other way is for a new file on Commons to be uploaded with the same name as a file here - there is no way that en.wp can be anything other than reactive to that situation (technologically or otherwise) so we have implemented a mitigation strategy that seemingly works for every single other bot on the project and, as multiple people more knowledgeable than me about coding, have said would be trivial to implement. This is an issue that would not have arisen had the operator of listeriabot operated in accordance with the rules every other bot operator has to follow. If you choose to base your workflow on a bot that operates outside of policy then that's a risk you choose to take. Thryduulf (talk) 14:16, 12 April 2020 (UTC)
@Thryduulf: As you say, there are multiple people more knowledgeable than you about coding. I suspect that such change would not be as trivial to implement as some people may have airily pronounced above (without any diff to back up their assertions), because the lists on the Listeria pages are not being generated from SQL queries that could be extended, but directly from WDQS via a SPARQL query -- and moreover, not a SPARQL query specified by the bot creator, but from whatever SPARQL query the user creating the list chooses to submit. Converting the results of that query (plus a couple of helpful macros) straight to wikitext is pretty straightforward, and makes for a nice clean straightforwardly-coded bot. Having to fish around in those results to see whether any of the columns returned are for a file, then to run a SQL query to check each file for a list that may be several hundred entries long is a significant coding overhead, adding unnecessary messiness and complication, non-negligible additional load on the servers, make it more likely that a particular update may as a result fail for a given list, and make the cause of such failures harder to diagnose.
I submit that that is not worth the candle for an issue which is unintended and transient and having its underlying cause already being taken care of by other bots. In fact, it sounds as if, by surfacing these filename collisions, which are then fixed, the bot in its present form may actually be doing some useful service.
In recent years Magnus has been being extraordinarily productive and creative, constantly producing and refining a non-stop stream of tools that are now underpinning a vast quantity of projects and work across Wikipedias, Commons, and Wikidata, as well as personally maintaining Mix'n'match which has now reached 3,500 different catalogues of identifiers, all being actively matched and cross-referenced. Listeria works. It does what it is meant to, displaying the results of a SPARQL query on a wiki page. If in the process it exposes some bad en-wiki filenames so that they can then be fixed, then so much the better. Given how much Magnus is achieving with his time at the moment, and how many new sorts of work he is making possible, and how useful Listeria is as it currently is, I would not seek to waste one moment of his precious limited time on an issue that is transient, is exposing fixes to filenames that need to be made anyway, and which according to you already get dealt with, when there is so much more he could be achieving doing other things. Jheald (talk) 15:14, 12 April 2020 (UTC)
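To make the "fishing around in those results" step from this comment concrete, here is a hedged Python sketch. It assumes the query results arrive in the standard SPARQL JSON results format and that Commons media values appear as `Special:FilePath` URIs; how Listeria actually handles its results may differ, and the function name is hypothetical.

```python
from typing import Set
from urllib.parse import unquote

# Assumed form of Commons media values in the query results.
FILEPATH_PREFIX = "http://commons.wikimedia.org/wiki/Special:FilePath/"


def file_names_in_results(sparql_json: dict) -> Set[str]:
    """Collect file names from every column of every row, whatever the column is called."""
    names: Set[str] = set()
    for row in sparql_json.get("results", {}).get("bindings", []):
        for cell in row.values():
            value = cell.get("value", "")
            if cell.get("type") == "uri" and value.startswith(FILEPATH_PREFIX):
                encoded = value[len(FILEPATH_PREFIX):]
                names.add(unquote(encoded).replace("_", " "))
    return names


sample_results = {
    "head": {"vars": ["item", "image"]},
    "results": {
        "bindings": [
            {
                "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q1"},
                "image": {"type": "uri", "value": FILEPATH_PREFIX + "Foo%20bar.jpg"},
            },
            {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q2"}},
        ]
    },
}
print(file_names_in_results(sample_results))
```

Because the user's query decides the column names, the sketch inspects values rather than variable names. The resulting set could then be checked against the non-free category in batches rather than one query per file.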
Your entire argument is predicated on the problems the bot is causing being trivial. They are not. Copyvios, even transiently, are a big effing deal. A bot editing pages it is not authorised to edit is a big effing deal. If Magnus is not able to properly maintain the bot then he shouldn't be operating it, exactly the same as any other bot operator. Why they are not able to properly maintain it is irrelevant. Nobody's time is too valuable to edit in accordance with consensus, and if they think it isn't then they should not be editing at all. Good contributions elsewhere do not, and cannot, justify disruption. Claiming that these errors should be allowed to stand because they cause work for other editors fixing problems that may (or may not) need to be fixed by others is disrupting Wikipedia to prove a point. Thryduulf (talk) 15:43, 12 April 2020 (UTC)
Off-topic discussion. Relevant discussion moved below this close
Your entire argument is predicated on the problems being caused by the bot; they are not, as has been explained to you here and elsewhere, ad nauseam. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:08, 12 April 2020 (UTC)
Are you seriously trying to argue that a bot making copyright infringing edits to the encyclopaedia is not a problem with the bot!? I'm sorry but that's the most ridiculous argument I've seen so far! 18:43, 12 April 2020 (UTC) - — Preceding unsigned comment added by Thryduulf (talkcontribs) 19:43, 12 April 2020 (UTC)
I'm seriously suggesting that you don't know what you're talking about. HTH. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:49, 12 April 2020 (UTC)
This is a blatant personal attack which must lead to a block, but unfortunately Pigsonthewing is unblockable on the English Wikipedia.--Ymblanter (talk) 19:22, 12 April 2020 (UTC)
That's not the first time you've falsely accused me of making a personal attack; I suggest you desist. To continue to do so would betray a lack of understanding of what constitutes a personal attack. In other words, it would demonstrate that you don't know what you're talking about. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:04, 14 April 2020 (UTC)
I think it is pretty clear that you continue your uncivil behavior only because I said several days ago that I am involved with you and will not block you on the English Wikipedia, and you clearly expect that your wikifriends will cover this behavior, as it happened multiple times in the past. It does not make it more civil though.--Ymblanter (talk) 12:11, 14 April 2020 (UTC)
I think it is clear who is being uncivil; and making personal attacks; and bringing a grudge from elsewhere; and is not here to contribute. And it is not me. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:29, 14 April 2020 (UTC)
Let's all calm down a bit.
Andy: could you please elaborate on how the issues are not "being caused by the bot"? As I see it, there is a very clear link between the bot making edits and non-free content appearing in a manner prohibited by policy. Mdaniels5757 (talk) 19:50, 12 April 2020 (UTC)
As I said; this has been explained previously. The bot makes a good-faith edit, at the request of a random editor, to link to a free image that exists on Commons. This fails, because this project stupidly allows a different, non-free, image to exist, using the same file name. (Notwithstanding that in one case a non-free image had apparently been placed on Commons by someone else; you can no more blame the bot operator or requesting editor for that, than you would a human who inadvertently included it on a page.) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:04, 14 April 2020 (UTC)
@Thryduulf: Wikipedia has a reputation for being sparing with non-free content, which is hard-won and has been achieved with some pain, but is very very valuable and useful. But let's dial the rhetoric back to reality, because over-dramatisation really isn't helpful. In terms of legal risk construed narrowly, such as might "cause the encyclopaedia or its editors [to] suffer real-world harm", that's quite the extreme bit of knicker-wringing above, and we would do better to keep the discussion here rather more grounded in reality. In narrow legal terms, most of what the bot might inadvertantly add due to the filename confusion would probably be protected as fair use or fair dealing anyway, even though it would fall outwith the narrower limits of policy. Moreover, because of the notice-and-takedown protections granted to content hosts, legally the clock only starts running once a notice has been received, if the site is slow on acting on it. That is unlikely, given (i) that, according to you, procedures are in place to fix the name conflicts as soon as they become visible; and (ii) even if they had slipped though, an insta-fix (renaming the file) would be available as soon as any notice was achieved to bring it to awareness. So legal consequences are far-fetched. That leaves the potential for reputational consequences. I in no way dismiss the significance of such consequences just for being reputational -- I think most of us would agree that Wikipedia's overall reputation is orders of magnitude more important than whether we happen to be upheld or not in a single legel case. But I think there is room to differ on whether allowing Listeria to surface a handful of files transiently for a few days until an organised procedure fixes the problem by renaming them (a renaming which I think we agree is desirable anyway in its own right) actually has any reputational significance. I frankly don't see it. 
Others might differ, but if we do have an organised procedure which is identifying and fixing these filename collisions that have the potential to confuse users, even at the cost of those files as a result sometimes being made visible for a few days where they shouldn't, then to me I think that's actually quite positive to Wikipedia's reputation: we have identified a problem of potential filename confusion, and have an active and effective system in place that helps identify cases and deal with them. To me, that actually seems reputation-positive, rather then reputation-negative. Looked at cold, I don't think there is either a legal or a reputational risk here, so long as the filename issues being exposed are indeed getting rapidly dealt with.
Further on that point, it's increasingly clear that when Listeria surfaces one of these filename collisions, that is actually helpful -- because in general these collisions are rather hard to find, whereas the intersection of files that are non-free and files that are on Listeria pages are rather easy to identify, with a cause that is very clear, making this rather a useful way for them to be surfaced, so that they can then be fixed (which is something we want to do). So it would be helpful if bots that auto-remove non-free content would ignore Listeria pages, so that this files with this very specific issue can be left in place, so that they can then be identified and picked up by the established procedure that is specifically appropriate for them. Shooting the messenger, or killing the canary in the coalmine, is not actually helpful if what we are wanting to do is to identify and fix these collisions.
Finally, it is worth noting that Listeria currently operates across 71 different wikis (keeping in all over 66,000 different lists actively updated), all from the same code. From a maintenance point of view it is not good design to make the code more complicated than it needs to be. It is not good design to make the bot mask filename problems rather than expose them, so they can be more readily identified and fixed. And, particularly when thinking about when a deep maintenance change may be required (such as recently when the wbterms table was retired), it is absolutely not good design to fragment that single piece of code into a multitude of different scripts, each specialised to a different wiki, that then all have to be updated separately. Such a change is not something to enter into lightly.
Luckily in this case no such change is actually necessary, because (as discussed above) in the present case the way the bot is surfacing filename issues is not just tolerable, it is actually useful, and should be retained. Jheald (talk) 23:41, 12 April 2020 (UTC)
That's a hell of a lot of words to say "Please can we knowingly violate the non-free content policy because the bot is useful and fixing it would be a lot of work?". The answer to that can only be "No. Complying with policy is not optional.". Your comment about the set of non-free files on Listeria lists being visible to bots is also interesting, because that requires bots to easily be able to distinguish free and non-free files - a task that those arguing for this policy exception claim is very complicated and/or impossible for Listeriabot. Which is it? You make grand noises about safe harbour, protection from legal harm, etc. but that's not the point at all - those only apply because we take reasonable steps to minimise the likelihood of copyright violations happening in the first place (that's the point of the NFCC), we can't abandon that and still claim protection. Finally, you say that the images appearing in the lists are probably fair use anyway - no, they aren't. A non-free image in a list cannot be being used for critical commentary or parody of the image. Thryduulf (talk) 00:39, 13 April 2020 (UTC)
@Thryduulf: WP:IAR: Understand the purpose of rules, and do what is best for the encyclopedia, rather than apply them blindly.
In this case, where the root problem is the name collision, the community has taken the view that it needs human intervention to hand-choose appropriate new names, so a bot can't fix the underlying problem.
Identifying images that are on Listeria pages and in the non-free category after Listeria has operated (and referring them for human intervention) is comparatively easy -- and, I am putting to you, the preferable option in any case, because then we fix the actual underlying problem. But it relies on Listeria having made the edit, so that the images are on the page, and can therefore be found by the SQL service as being on a Listeria page.
Changing Listeria so it doesn't add the image at all can't use this simple approach, would fragment the Listeria code with the consequences discussed above, and -- the key point -- is undesirable in its own terms because it doesn't end up with the underlying problem of the bad filenames getting fixed.
As to the narrow legal point, what you assert above is simply not the law. Unless you are facilitating piracy on the scale of something like The Pirate Bay, where assisting piracy is the very purpose of the site, the obligation laid on platforms by the law is to deal promptly with asserted copyright infringements as soon as the site is made aware of them by a DMCA notice or its equivalent. It is hugely to Wikipedia's credit, and to the huge benefit of our reputation, that we go way way beyond that, and as a result get very very few infringement notices. But we should look at this with clear eyes. Allowing a file (or at most a very small handful of files) to briefly surface in the wrong place, which we then rapidly fix by renaming the file, thereby definitively removing the possibility of any further confusion between the two files down the line, is a responsible course of action which is not going to damage WP's reputation, or in any way weaken the standing of the NFC policy. So long as the name collision is picked up quickly and then rapidly fixed, there is no more significance here than our practice, say, of leaving files briefly in place and in context while their appropriateness is reviewed at WP:FFD. So long as there is an efficient mechanism in place that is dealing with the name collisions once Listeria surfaces them, as you assure me there is, then Listeria's action is actually helping a useful process, and the prospect of legal or reputational harm by that process is non-existent. Asserting otherwise does not reflect reality. Jheald (talk) 08:19, 13 April 2020 (UTC)
Hat removed. @Headbomb: This is not a red-herring discussion. It directly pertains to what is the right way forward here: what are the actual costs and benefits of the different ways forward that might be pursued; and, indeed, whether the bot surfacing these shadow filename issues is actually a problem at all, or whether it may actually be helpful, as a beneficial element of fixing files with this issue. Those are very germane issues, worth wider discussion. Jheald (talk) 10:09, 13 April 2020 (UTC)
Sorry, IAR is only for uncommon situations where the encyclopaedia would definitely be improved. Non-free images in a list is not an improvement to the encyclopaedia, ignoring a rule on an ongoing basis is never acceptable (if the reason for wanting to do so is a good one then you will have no trouble getting consensus to change the rule in such a way that you don't have to ignore it), and it being easier to ignore the rule than comply with it are all reasons to say "no, you may not ignore this rule". Thryduulf (talk) 11:15, 13 April 2020 (UTC)
The fact that this non-issue was ignored for so long is telling us that it is not an issue. The underlying issue needs to be solved better, but it is already sufficiently taken care of so as not to be a real problem. I suspect the whole storm is not whipped up because of the non-free images issues, but using that as an excuse to shut the bot down because some people do not agree with the source of its data. If we want to have a functioning wikipedia in 10 years time we need to be forward-looking. Looking forward means also not to close your eyes to the solution. Agathoclea (talk) 12:01, 13 April 2020 (UTC)
@Agathoclea: I can't speak for everyone of course, but I'm one of the most vocal in support of this bot's block but I'm also a strong supporter of Wikidata and of the benefits it provides. The reason I'm making a fuss now is that this is the first time I've been aware that the non-free files problem has existed. Unlike the bot controller who has been aware since at least 2017. Thryduulf (talk) 12:15, 13 April 2020 (UTC)
"to say 'Please can we knowingly violate the non-free content policy..?'". That is not what is being said, and it is wrong and misleading of you to suggest that it is. What is being said is more like "If a bot very occasionally and inadvertently causes a thumbnail of a non-free image to be shown on a non-mainspace page, because this project stupidly allows file names that duplicate those of different images on Commons, please can we deal with that in a sensible manner, by renaming the errant image, rather than damaging the project by hysterically over-reacting and blocking the bot". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:16, 14 April 2020 (UTC)
You want the bot to be allowed to display a non-free image on a non-mainspace page. Displaying non-free images on non-mainspace pages is explicitly against the non-free content policy. You therefore want the bot to be allowed to violate the non-free content policy. You can try to weasel out of it by blaming others for not making fundamental changes to the core software, writing additional bots and/or taking other actions that mean this one bot wouldn't need to be fixed, but that does not alter the fundamental nature of your request. Thryduulf (talk) 14:22, 14 April 2020 (UTC)
I've just told you that your claim was wrong and misleading, and your response is to make another claim that is false and misleading? I want nothing of the kind. I want - as I just said - us to resolve such rare issues in a sensible manner, by renaming the errant image, rather than damaging the project by hysterically over-reacting and blocking the bot. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:35, 14 April 2020 (UTC)
No, you asserted that my claim was false and misleading. I simply demonstrated that it was accurate. It's true you only want to be allowed to do it temporarily, but that doesn't change that you want to do it at all. Yes you want the "errant" image renamed, but that request is being discussed at WT:NFC and is unrelated to what is being requested here. What is being requested here is permission to display the non-free image until it is renamed to something that does not shadow a free image from Commons. Doing that would require an exemption from the policy prohibiting the display of non-free images outside mainspace. Preventing images on en.wp and Commons having the same file name (the only way that shadowing can be prevented) would require a change to MediaWiki software - something that is completely outside the power of en.wp to implement and so irrelevant for the purposes of this discussion. Thryduulf (talk) 15:42, 14 April 2020 (UTC)
Note The above discussion has nothing to do with possible forks of ListeriaBot, do not re-open. Move it to WP:VP if you have to. Headbomb {t · c · p · b} 15:33, 13 April 2020 (UTC)
@Headbomb: It's not for you to close a discussion you are party to.
And yes, it is absolutely appropriate to look at the negative consequences that could arise from forking ListeriaBot, as well as the practicality of the suggestion. That's why we have discussions here, so people can raise and work through exactly such points.
So, reverted. Jheald (talk) 15:36, 13 April 2020 (UTC)
I am not a "party to this discussion", I'm a BAG member and as a BAG member, I can tell you that nothing above is pertinent to a forking discussion, nor do they tackle the "costs and benefits" of forking. There are basically three paths forward: a) fixing ListeriaBot, which requires Magnus to communicate and update their code; b) forking it so someone else can update the code and run it instead of Magnus; c) fixing all filename collisions and preventing them from happening in the future. a) would be quick if Magnus gets around to fixing their code and lets us know they've done so. b) can be quick, someone just needs to fork the publicly-available code and make a WP:BRFA. c) is the least likely to occur, given it involves identifying and fixing all collisions, and preventing them from happening in the future. c) is not impossible to do, but it's a much bigger and slower effort than either the a) or b) solutions. Headbomb {t · c · p · b} 15:34, 14 April 2020 (UTC)
You being a member of BAG gives you no authority to manage the discussion here as you have attempted to do, including closing discussions immediately after you have participated in them. You have just closed another section - [added] to which you were very much a party - where someone has falsely accused me of trolling, after I pointed out they were making provably incorrect claims, leaving me no avenue to refute that accusation. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:46, 14 April 2020 (UTC)
Refute it elsewhere. The conversation was going nowhere and is cluttering up the more-valid discussions about the bot and its activities. Primefac (talk) 16:53, 14 April 2020 (UTC)
@Headbomb: Other paths might include:
d) recognising that these NFC glitches are few and far between, get flagged for clearance pretty quickly already by Green C bot, and are essentially harmless.
e) creating a script specifically to identify and mop up these occasional glitches, perhaps along the lines suggested here, to run either periodically, or soon after each Listeria edit
f) some other approach that some experienced bot author may suggest here.
It is valuable to flag and recognise that complicating or forking Listeria are not zero-cost options. Listeria has infrastructure-level significance across 70 wikis, allowing any arbitrary SPARQL query to immediately be presented as a fully-formatted list on the wiki, that will then be kept updated. This currently supports 65,000 live pages across those wikis, including wide use for project management, wide use for individual curation projects by individual users, and wide use to present the results of collaborative curations with external partner institutions. So this is code that it is important to keep as clean and maintainable and unified as possible. It is not code to mess around with lightly, not code to add complexity to unnecessarily; and it is code to avoid fragmenting as far as we possibly can, so that any fixes or updates immediately apply everywhere, and so that it continues to reliably work in the same way with the same syntax wherever it is used, so that the pages calling Listeria can continue to be portable between one wiki and another immediately without change.
Do these considerations trump all others? Not necessarily. But they are not small things either. Jheald (talk) 16:57, 14 April 2020 (UTC)
Those are all variations of c). Headbomb {t · c · p · b} 17:07, 14 April 2020 (UTC)
Not really. (d) suggests that this whole issue is a de minimis trifle, and not worth further worrying about; (e) suggests leaving Listeria unchanged, but identifying the issues case-by-case as they come up on the fly, which is rather different to up-front mass-fixing all filename collisions, and could be rather smaller and easier than (a) or (b); (f) might be something entirely different again. Jheald (talk) 17:22, 14 April 2020 (UTC)
That was going to be my next suggestion - to extract out the code that does the wiki-list updates and run that via another bot. I have no experience in this area, but I'm sure it can be done. Advanced thanks to anyone and everyone who can help with this. Lugnuts Fire Walk with Me 11:56, 12 April 2020 (UTC)
This is just a suggestion. I think the tool can be placed in Maintainer needed section in Phab. Adithyak1997 (talk) 12:09, 12 April 2020 (UTC)
If someone wishes to fork out the bot and put through a new BRFA, please do. Having an active bot operator and a clearly-defined bot task is much preferred over the current situation. Primefac (talk) 13:28, 12 April 2020 (UTC)

Break

@Anomie, Jarry1250, TheSandDoctor, and Dreamy Jazz: sorry to bother you guys, I've pinged you since you four are the only active bot operators with experience in PHP that I know of and I was wondering if any of you would like to help out forking the now blocked Listeriabot. The problem is that it can accidentally display non-free images where fair use doesn't apply when a non-free file on English Wikipedia shadows a file on Commons the bot is trying to include in a list. Source code is available at BitBucket. My suggested implementation would be getting the intersection of Category:Wikipedia files that shadow a file on Wikimedia Commons and Category:All non-free media at the beginning of the run and then checking that each added image is not on that list, simply not adding it if it is. Thanks for considering it! ‑‑Trialpears (talk) 22:47, 14 April 2020 (UTC)
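The check suggested above can be sketched roughly as follows. This is a minimal illustration in Python rather than the bot's own PHP. The function names and sample titles are hypothetical, and fetching the members of the two categories is assumed to happen elsewhere.

```python
from typing import Iterable, List, Set


def build_blocklist(shadowing: Iterable[str], non_free: Iterable[str]) -> Set[str]:
    """Titles in both categories: local non-free files that shadow a Commons file."""
    return set(shadowing) & set(non_free)


def filter_images(candidates: Iterable[str], blocklist: Set[str]) -> List[str]:
    """Keep only images that are not on the blocklist, preserving their order."""
    return [title for title in candidates if title not in blocklist]


# Hypothetical sample data standing in for the two category listings.
shadowing = {"File:Logo.png", "File:Map.svg"}
non_free = {"File:Logo.png", "File:Album cover.jpg"}

blocklist = build_blocklist(shadowing, non_free)
safe = filter_images(["File:Logo.png", "File:Map.svg", "File:Portrait.jpg"], blocklist)
```

Building the blocklist once per run keeps the per-image cost to a set lookup, at the price of relying on how fresh the shadowing category is.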
Trialpears, thanks for your efforts to find a new operator. Best, Barkeep49 (talk) 01:57, 15 April 2020 (UTC)
I have only programmed bots using python (pywikibot), but I'll give the source code a look over. Dreamy Jazz 🎷 talk to me | my contributions 08:36, 15 April 2020 (UTC)
I think using Category:Wikipedia files that shadow a file on Wikimedia Commons may be unreliable. The category is added by a bot weekly, so a shadowing image could go undetected for up to a week if the check relies on it. Therefore, I think it would be best to have the bot check for the non-free category on the local version (if a file with the same name on the wiki where the image will be posted exists, check if it is in Category:All non-free media (or similar for other wikis)). Dreamy Jazz 🎷 talk to me | my contributions 09:03, 15 April 2020 (UTC)
Perhaps this issue can be mitigated by making the GreenC bot task a daily one ... GreenC, do you think that can be done? Jo-Jo Eumerus (talk) 09:08, 15 April 2020 (UTC)
Yes and multiple times a day is also available. -- GreenC 13:01, 15 April 2020 (UTC)
@GreenC, Jo-Jo Eumerus, and Dreamy Jazz: If we catch shadowed files more quickly by Green C bot, and perhaps auto-move them (even as a temporary measure while adding another tracking category for later human checking), would that solve the specific issue with this bot and mean that it could start running again as-is? It doesn't solve the wider problem I was trying to address with the RfC, but for this specific issue? Thanks. Mike Peel (talk) 17:33, 15 April 2020 (UTC)
Well, not as it stands, because AFAIK GreenCbot only tags the files, it wouldn't stop Listeria adding them to the lists. Unless, as you say, GreenCbot could be rewritten to move them as it finds them - though that would need another BRFA (which would probably be very quick). Black Kite (talk) 17:53, 15 April 2020 (UTC)
@Black Kite: I can trivially write a bot that would do the moves (as I already offered at the RfC), and would be happy to do so. Thanks. Mike Peel (talk) 17:59, 15 April 2020 (UTC)
So you'd have to have a suite that runs GreenC bot -> move bot -> Listeria. You couldn't leave time between them (and even if they ran consecutively then you still might get the odd edge case). You know, this is all great, but it does make me think that the only way of properly fixing the issue is to fix Listeria, especially as the fix is quite basic. Black Kite (talk) 18:08, 15 April 2020 (UTC)
@Mike Peel: No; not always is moving the file the correct solution and how does the bot know what new name to use? Increasing the rate at which GreenC bot tags the files (as discussed by GreenC above) makes it easier to resolve the problems before Listeriabot trips up on them, though. Jo-Jo Eumerus (talk) 18:27, 15 April 2020 (UTC)
I think a variation of this could be a possible solution. First of all having a separate bot to temporarily move them is not a good idea. GreenC is active and if this is to be implemented GreenC bot should do it. That avoids most of the timing issue. Secondly if it is to move files it should just append something like (local) and put a template on the description page to indicate that someone should review the move. If that is done I think this is a fine solution since finding someone to fork the bot turns out to be quite difficult. ‑‑Trialpears (talk) 18:58, 15 April 2020 (UTC)
@Black Kite and Jo-Jo Eumerus: So we move the file and add it to a category for human checking to check that it was the correct solution (and to either approve it if it was correct, or fix it if not)? Doing it immediately via GreenC bot would be optimal, but I could run a move script via pi bot as often as you like, with the caveat that the more often it runs the more server resources it uses. I code pi bot using python/pywikibot, so I'm not sure that's compatible with GreenC bot's code directly. The only solution to 'fix Listeria' that I've seen is for it to run an additional check for *every* image it includes *every* time it runs, which involves a hell of a lot more server resources than fixing the edge cases. Thanks. Mike Peel (talk) 19:03, 15 April 2020 (UTC)
GreenC bot can move pages trivially, if required. There is another bot task related Wikipedia:Bot_requests#Follow_up_task_for_files_tagged_Shadows_Commons_by_GreenC_bot_job_10 by User:Philroc but I don't think they took it to BRFA yet. It is complicated because what happens is someone uploads a file to Commons, it triggers a shadow match by GreenC bot, then Commons deletes it as a copyvio and the shadow no longer exists. So Philroc's bot removes the shadow tag added by GreenC bot if there is no longer a shadow happening ie. the file no longer exists on Commons. If I recall, one reason GreenC bot is running less than daily is to give time for the shadowing to sort itself out, for Commons and Enwiki to resolve copyvios and deletions before declaring a shadow exists. -- GreenC 19:36, 15 April 2020 (UTC)
It's still very much a WP:CONTEXTBOT situation, there is no simple rename pattern and the ones I use are something I make up on the fly. An automatic file rename is not really the way to go here. Jo-Jo Eumerus (talk) 19:43, 15 April 2020 (UTC)
@Jo-Jo Eumerus: That's new to me, and seems crazy. However, the examples don't seem to apply to this situation, I'm just proposing a bot that moves files from one maintenance category to another here, they would still need human intervention (the RfC is wider, but that's not what we're talking about here). Thanks. Mike Peel (talk) 20:09, 15 April 2020 (UTC)
As a practical demonstration, [13] was by bot [14]. Thanks. Mike Peel (talk) 21:11, 15 April 2020 (UTC)
If we are to pursue the route of GreenC bot automatically moving shadowed files to a temporary file name I would suggest the following workflow: GreenC bot starts with appending (temporary) to the file name to temporarily solve the shadow problem. GreenC bot then adds the following template based on {{Shadows commons}} and associated tracking category to the description to explain the issue. Editors continue to review these files as normal but without having a name conflict in the meantime. Do you think this would be a workable system Jo-Jo Eumerus? If so I think we should ask other people working with shadowed files whether they see any issues and then implement it. ‑‑Trialpears (talk) 22:05, 16 April 2020 (UTC)
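The renaming step of that workflow could look something like the sketch below. It only shows how a temporary title might be derived. The function name, the placement of the suffix before the file extension, and the sample title are assumptions made for illustration, not an agreed naming pattern.

```python
import os


def temporary_title(title: str, suffix: str = "(temporary)") -> str:
    """Insert the suffix before the file extension of a File: page title."""
    stem, ext = os.path.splitext(title)
    return f"{stem} {suffix}{ext}"


# Hypothetical example title.
new_title = temporary_title("File:Blank.png")
```

The actual move, the tagging template and the tracking category would still have to be handled by the bot and reviewed by editors afterwards.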
@Trialpears: Honestly, I think this would be a waste of effort. Quite aside from the fact that there is no universal pattern for new file names - meaning that one would have to re-rename the file once again in many if not most instances - in many instances file moving is not the correct solution at all. Jo-Jo Eumerus (talk) 08:50, 17 April 2020 (UTC)

I think I'll leave this to a bot operator who is running php bots already. If no one wants to take up the task, I can, but it might be a while before I have working code. I'll probably do a rewrite in python using pywikibot (my experience with PHP is limited to websites). I think I do have some kind of way to fix this in the source. In "shared.inc" add a call to a function to check if the file stored in the property is shadowed on line 958ish in the method "renderCell". This new function loads the properties for the filename on the wiki the bot is currently processing and checks if the page ID for that file is zero / if the file exists. If it is, then the image is not shadowed. If the page ID is not zero / the file exists, then the file is shadowed. Then check if the shadowed image is in Category:All non-free media, if it is then don't add the image and if it isn't then still add the image. I think any person wanting to run a fork of this bot will need to fork / download both the listeria and magnustools repos from bitbucket, as listeria uses magnustools. Dreamy Jazz 🎷 talk to me | my contributions 11:03, 15 April 2020 (UTC)
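The decision described above can be sketched as a small pure function. This is an illustration in Python, not the change to "shared.inc" itself. The function name is hypothetical, and treating a page ID of zero as "no local page" follows the description in the comment and is an assumption about how the lookup reports a missing page.

```python
from typing import Iterable

NON_FREE_CATEGORY = "Category:All non-free media"


def should_add_image(local_page_id: int, local_categories: Iterable[str]) -> bool:
    """Return True if the image may be added to the list.

    local_page_id is assumed to be 0 when no local page of that name exists,
    in which case nothing shadows the Commons file.
    """
    if local_page_id == 0:
        return True  # not shadowed: the Commons file will be shown
    # Shadowed: skip the image only if the local file is non-free.
    return NON_FREE_CATEGORY not in set(local_categories)
```

For example, `should_add_image(0, [])` allows the image, while a non-zero page ID together with Category:All non-free media among the local categories blocks it.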
It's probably appropriate to add a ':c:' to replace the file with a link to the file on Commons whenever a Commons file is shadowed, without the check of Category:All non-free media, which is likely to be more expensive, and more likely to change from wiki to wiki. Jheald (talk) 11:24, 15 April 2020 (UTC)
It's a shame that it's not possible to use something like [[c:File:blah.jpg]] to always use the Commons file (it seems that just results in a link to the file). Thanks. Mike Peel (talk) 17:35, 15 April 2020 (UTC)
That would be useful but it's not something that can happen without developer input, so it needs to be requested at Phabricator (if it hasn't been already, if it has a link from here would be useful). It's also not something that is likely to happen with any great rapidity so it's almost certainly going to be quicker and easier to just fix Listeriabot. Thryduulf (talk) 19:53, 15 April 2020 (UTC)
@Thryduulf: I agree that this solution would need mediawiki developer input. I've proposed my preferred solution above - I still don't see any quick or easy solution that lets us fix Listeriabot without simultaneously solving the bigger issue of non-free files shadowing free files. Mike Peel (talk) 20:15, 15 April 2020 (UTC)
Even if you think that Listeria checking for non-free isn't a good idea (I'd disagree, but whatever), a trivial way of fixing Listeria so it could start working again would be for it to insert a link to the image in the list, rather than the image itself. Black Kite (talk) 21:09, 15 April 2020 (UTC)
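The difference between embedding an image and linking to it comes down to a leading colon in the wikitext. A minimal sketch, with a hypothetical function name and an arbitrary choice of the thumb option for the embedded case:

```python
def render_image_cell(filename: str, as_link: bool) -> str:
    """Return wikitext for a list cell: an embedded thumbnail or a plain link."""
    title = filename if filename.startswith("File:") else f"File:{filename}"
    if as_link:
        # A leading colon turns a file inclusion into an ordinary link.
        return f"[[:{title}]]"
    return f"[[{title}|thumb]]"


linked = render_image_cell("Blah.jpg", as_link=True)
embedded = render_image_cell("Blah.jpg", as_link=False)
```

With links only, a shadowing non-free file would never be displayed on the list page, whichever file the title resolves to.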
I am now working on a rewrite in python. Dreamy Jazz 🎷 talk to me | my contributions 22:22, 15 April 2020 (UTC)
However, this rewrite will take some time (probably long enough that the problems here are fixed). Dreamy Jazz 🎷 talk to me | my contributions 22:54, 15 April 2020 (UTC)
The rewrite is going faster than expected. I may be done in the next few days. I'll file a BRFA when I'm done if the bot has not been unblocked yet / the issues are fixed. Dreamy Jazz 🎷 talk to me | my contributions 23:06, 16 April 2020 (UTC)
There can be a description page without a file (for example File:Australia satellite plane.jpg, page ID 1258205) - https://en.wikipedia.org/w/api.php?action=query&titles=File:Australia_satellite_plane.jpg&prop=imageinfo has "imagerepository": "shared" which would be "local" for a file that exists in Wikipedia. Peter James (talk) 14:24, 16 April 2020 (UTC)
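A check based on the imagerepository field might look like the sketch below. It assumes the response has been decoded from JSON in the default layout, where query.pages is keyed by page ID. The sample dictionary is hand-written in that shape with other fields omitted and is not a captured response, and the function name is hypothetical.

```python
from typing import Any, Dict


def is_local_file(api_response: Dict[str, Any]) -> bool:
    """True if any page in the response reports imagerepository == 'local'."""
    pages = api_response.get("query", {}).get("pages", {})
    return any(page.get("imagerepository") == "local" for page in pages.values())


# Hand-written response in the shape described above (other fields omitted).
sample = {
    "query": {
        "pages": {
            "1258205": {
                "pageid": 1258205,
                "title": "File:Australia satellite plane.jpg",
                "imagerepository": "shared",
            }
        }
    }
}
shadowed = is_local_file(sample)
```

Here the description page exists locally but the file itself comes from the shared repository, so a test on the page ID alone would wrongly report a shadow while this check does not.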

"Inactivity" of the operator

This is going nowhere. I will leave this un-hatted so participants can read it, but the consensus here is that the bot operator is essentially inactive on the English Wikipedia, which is the salient point for an en-wiki bot. Primefac (talk) 15:08, 14 April 2020 (UTC)
I'll pre-emptively add that whether two months without an edit on Wikipedia qualifies as 'inactivity' is immaterial. The point here is that WP:BOTCOMM matters, and the expectation of the English Wikipedia community is that bot operators may not ignore communications that occur on the English Wikipedia, or require that issues are raised on a different forum. Headbomb {t · c · p · b} 15:20, 14 April 2020 (UTC)
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Split from #Forking the bot above.

"the operator is inactive" Maybe you could get a clue about what your fellow volunteers contribute, before you disparage them with such ignorance? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:04, 12 April 2020 (UTC)
@Pigsonthewing: He could be sweating like a galley slave elsewhere, but as far as en.wp is concerned, he hasn't been active since (early) February. Which means: as far as leaving a talk-page message in the traditional fashion goes, the likelihood of receiving a reply is receding rather than improving. HTH. ——SN54129 18:40, 12 April 2020 (UTC)
But that wasn't the claim made. Regarding your latter point, have you looked at his talk page? It appears that not one of the people loudly complaining about the bot has taken heed of the guidance there. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:47, 12 April 2020 (UTC)
Unfortunately, not many editors here know (or probably care!) about bitbucket; why should they? After all, en.wp has plenty of ways and places itself for on-wiki communication. Specifically, saying Beats spreading [messages] over half a dozen talk pages is slightly disingenuous: there's only one talk page he needs to worry about, and it's that one. ——SN54129 19:04, 12 April 2020 (UTC)
@Pigsonthewing:, it has been a longstanding principle of bot operation that enwiki users don't need to register elsewhere to take their concerns to a bot operator. Magnus hasn't been active on enwiki in 2 months. While the threshold of what exactly is "inactivity" will differ from people to people, saying that Magnus has been inactive in the past two months is hardly "disparaging them" or being "ignorant". So instead of complaining here that enwiki editors prefer to keep enwiki issues on enwiki, you could contact Magnus on BitBucket yourself if you think this will lead to a speedier resolution. Headbomb {t · c · p · b} 00:02, 13 April 2020 (UTC)
{{citation needed}} And don't attempt to close discussions immediately after posting to them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:13, 13 April 2020 (UTC)
Citation: WP:BOTCOMM: "Bot operators should take care in the design of communications, and ensure that they will be able to meet any inquiries resulting from the bot's operation cordially, promptly, and appropriately. This is a condition of operation of bots in general. At a minimum, the operator should ensure that other users will be willing and able to address any messages left in this way if they cannot be sure to do so themselves." WP:BOTACC: "All policies apply to a bot account in the same way as to any other user account." Thryduulf (talk) 12:26, 13 April 2020 (UTC)
Those are indeed citations. Just not for the claim that was made. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:00, 13 April 2020 (UTC)
Those citations support the claims immediately preceding your request for citations. If you were requesting citations for something else then you need to actually specify what that something else is, we cannot read your mind. Thryduulf (talk) 20:05, 13 April 2020 (UTC)
"Those citations support the claims immediately preceding your request for citations" They do not. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:52, 14 April 2020 (UTC)
Were he the operator of just one bot, on just one project, your point might have a shred of validity. As he is not, it does not. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:14, 13 April 2020 (UTC)
Incorrect, the same rules apply to everybody: if you want to operate a bot on the English Wikipedia you must be available to respond to issues on the English Wikipedia. What the operator does or does not do on other projects is irrelevant. If an operator is unable or unwilling to deal with issues related to their bot on the English Wikipedia then they will have their operator privileges for the English Wikipedia withdrawn, regardless of why they are unable or unwilling to follow basic policy. Thryduulf (talk) 12:20, 13 April 2020 (UTC)
Poppycock; try reading the post I was replying to. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:24, 13 April 2020 (UTC)
I read, and have re-read, the post you were replying to. There is only one talk page he has to worry about regarding the English Wikipedia. If he has to pay attention to other talk pages for other business that's his choice, but it doesn't make either my or Serial Number 54129's posts incorrect. If there is too much for him to keep track of then he needs to either stop something or hand it over to someone else who can resolve the issues. 20:05, 13 April 2020 (UTC)
I note that the claim you now make is not the claim to which I replied; both 54129's and your earlier post are incorrect. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:52, 14 April 2020 (UTC)
Andy, I am making the same claim in both posts just using different language because you apparently misinterpreted it the first time. Both make the same claim that I understand SN54129 was making. Additionally your comments in these discussions are getting increasingly towards a style of "I disagree with something you said, but I'm not going to tell you what it was or why I disagree with it, because I'm right and you are wrong." This is not how to resolve a dispute. Thryduulf (talk) 14:16, 14 April 2020 (UTC)
You are indeed making the same claim more than once. However it is different to the claim made by 54129, which you wrongly said I was incorrect to describe as not valid. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:56, 14 April 2020 (UTC)
On the assumption you've indented correctly, yes, you were replying to me. Incorrectly. As, I repeat, it does not matter how many bots he runs or where he does so: what he does on the English Wikipedia will be discussed on the English Wikipedia, there are literally no two ways about it. You are either accidentally or deliberately misunderstanding what (multiple) editors are telling you; since you are clearly competent, it can only be assumed that the latter applies. ——SN54129 14:28, 14 April 2020 (UTC)
I see that you, too, are now making a different claim to the one I originally described as invalid. The confusion, deliberate or otherwise, is not mine. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:56, 14 April 2020 (UTC)
I can't imagine why you feel the need to troll the discussion, but, here we are. ——SN54129 15:05, 14 April 2020 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Recent coronavirus-related publicity for the bot

See this Hacker News thread on a Listeriabot-created list of Wikidata-notable people who have died from the coronavirus. It would be helpful if we could get this issue resolved so that the bot can continue keeping the list up-to-date; it is of current interest and rapidly changing. —David Eppstein (talk) 19:20, 13 April 2020 (UTC)

David Eppstein, that list is on Wikidata and thus is not affected by the bot being blocked on English Wikipedia. The underlying issue doesn't really affect other sites since non-free content isn't used to the same extent (or at all) on other wikis. ‑‑Trialpears (talk) 20:01, 13 April 2020 (UTC)
Yes, I was just coming back to post a clarification about this. I'm skeptical that other Wikipedias don't use non-free content, but maybe one fix would be to migrate the various Listeriabot redlink lists to Wikidata? Or would that be unacceptable as the redlinks are targeted at a specific Wikipedia (the one for which they are redlinks)? —David Eppstein (talk) 20:04, 13 April 2020 (UTC)
I guess that would work as a temporary measure if something is important to have updated several times a week. I am quite confident that this will be resolved within a week and everything will be back to normal. ‑‑Trialpears (talk) 21:04, 13 April 2020 (UTC)

What a crock. Listeria does not link to any specific red links OTHER than to the red links on that very Wikipedia. Remember the same query will result in the same content. However, red links are different. Check out the COVID-19 deaths Listeria list on the Dutch Wikipedia. English Wikipedia is one of the exceedingly few Wikipedias that supports non-free imagery. This disaster is one of your own making. Thanks, GerardM (talk) 16:00, 14 April 2020 (UTC)

Gerard, this is factually incorrect. Among bigger projects, only Dutch, Spanish, and Swedish Wikipedias disallow fair use, and German is very restrictive. It is by far not the majority.--Ymblanter (talk) 16:20, 14 April 2020 (UTC)
And by self-selecting you change the argument. We are talking about Wikipedias, not the biggest Wikipedias. We are talking about the qualities of English Wikipedia; it argues how important copyright is without explaining WHY this modus operandi is actually effective and addresses a threat. It is easy enough to change the routine and NOT have local files take precedence. We could discuss the four percent error rate that is dismissed because it is not opportune; it will change the outcome of this argument while improving quality for our readers. It is argued that this will blow over; ask yourself what it is that Wikipedia stands for. It has a dogmatic establishment unable to reflect on its strengths and weaknesses when challenged. — Preceding unsigned comment added by GerardM (talkcontribs) 03:54, 15 April 2020 (UTC)
This is the English Wikipedia, and bots that operate on the English Wikipedia must follow English Wikipedia policies. What's done on the Dutch Wikipedia is irrelevant here, and this attitude that enwiki only has itself to blame, or whatever the above is supposed to be, is unproductive at best. Headbomb {t · c · p · b} 04:46, 15 April 2020 (UTC)

That list is not on Wikidata - find it on nl.wp

The listeria list may be found here .. it is maintained by the ListeriaBot. Thanks, GerardM (talk) 10:59, 16 April 2020 (UTC)

How is that relevant to this discussion? Thryduulf (talk) 13:42, 16 April 2020 (UTC)

Hatting of comments

Off-topic (non-admin closure) ——SN54129 18:53, 14 April 2020 (UTC)
The following discussion has been closed. Please do not modify it.

My reply to Thryduulf that "Your entire argument is predicated on the problems being caused by the bot; they are not, as has been explained to you here and elsewhere, ad nauseam." is neither off topic nor irrelevant, but has been included in a section collapsed as such; including by an editor with whom I am in disagreement on that point, and by an involved admin who blocked the bot in question. Another section has been closed in a most partisan manner, by an editor whose earlier closure of that section was also reverted after he tried to use his closure to have the last word. It really is unacceptable for people on one side of a discussion to try to manage it in this manner. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:20, 14 April 2020 (UTC)

Oh wait, he didn't make the latest close to the latter; he just inserted his comment after it was closed. Can we all do that, or is that too only for people on one side of the discussion? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:41, 14 April 2020 (UTC)
The above is not off-topic, and its hatting by one of the people whose actions are described amply illustrates the point. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:18, 14 April 2020 (UTC)

It's been a week

It's been a week since ListeriaBot was (re-)blocked. Are we any closer to unblocking it or improving on it? As far as I can see, there are a few options:

  1. Unblock the bot so it can continue to operate as normal. Pro: lists here will continue to be updated as normal. Con: The file shadowing issue may reoccur.
  2. Move all local non-free files to a new filename. Pro: would solve this issue, as well as avoiding all other cases where a non-free file shadows a free file. Con: seems to be controversial.
  3. Move non-free files that are blocking free files to a new filename. Pro: This would solve the issue. Con: Still controversial, wouldn't solve the wider problem.
  4. Wait for a new bot operator to rewrite it so that it avoids non-free files. Pro: This would solve the issue. Con: No-one has demonstrated that this is technically possible, and it still wouldn't solve the wider shadowing problem.
  5. Continue blocking it. Pro: This would solve the issue. Con: This stops us using lists from Wikidata to improve our content.

How should we move forward here? Thanks. Mike Peel (talk) 22:28, 17 April 2020 (UTC)

You forgot #6: Have the bot operator fix the issue, but since that seems like a non-starter given their complete lack of participation in this discussion, at the moment we're somewhere around 4 and/or 5. I have zero inclination to unblock a bot that we know can and will break our policies if given the correct circumstances. Primefac (talk) 22:47, 17 April 2020 (UTC)
@Primefac: That fell under "No-one has demonstrated that this is technically possible" (or perhaps "technically sane" - unless an "additional check for *every* image it includes *every* time it runs" is seen as sane). Mike Peel (talk) 22:51, 17 April 2020 (UTC)
So far, we're at #5 by virtue of #1 being a no-go. Unblocking a malfunctioning bot should not happen. As for which of #2/3/4 happens next, that's mostly up to the community. But #4 is clearly possible, and it's a fairly trivial thing to implement for anyone with coding skills. Headbomb {t · c · p · b} 23:46, 17 April 2020 (UTC)
Not to mention the even more trivial change that could be implemented - even as a stop-gap - which would be for the bot to insert a link to the image, rather than the image itself, in the lists. Black Kite (talk) 00:13, 18 April 2020 (UTC)
Options 4, 5, and 6 are the only options that would ever achieve consensus. Unblocking the bot as-is is not an option. TonyBallioni (talk) 23:56, 17 April 2020 (UTC)
I'm with Primefac here - the operator hasn't participated in the discussion at all, despite not being globally inactive with their last global edit being this week (caution-slow link) - and they aren't asking for their bot to be unblocked - so there is no pressure here. No one should ever depend on someone else making a future edit, including edits they may make via their bot. — xaosflux Talk 02:25, 18 April 2020 (UTC)
For the reasons explained in the discussion at WT:NFC, not only are 2 and 3 controversial, it's debatable at best whether they will actually solve the issues, meaning that options 4-6 are the only viable ones. I fully agree also with Xaosflux that there is neither a rush nor a deadline. Thryduulf (talk) 10:29, 18 April 2020 (UTC)
I am aiming to do No. 4. I am rewriting the bot in Python (as I have written bots in Python before and I find it easier to deal with in a wiki context). I have managed to rewrite over half of it. I was going to submit a BRFA, but I thought it would be best to finish the code before I did. Dreamy Jazz 🎷 talk to me | my contributions 12:47, 18 April 2020 (UTC)
Dreamy Jazz, just noting (as I close this section...) that the bot has been fixed per the section below. Thank you for all the effort you've put in, but it looks like your services are no longer needed. Primefac (talk) 23:00, 18 April 2020 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Update (botop)

OK, I was just made aware of this discussion (I am running ListeriaBot).

  • I just blocked namespace 0 editing on enwiki for the bot, as I have done for dewiki and frwiki on request before.
  • Blocking accidental inclusion of non-free images in user namespace would take a bit longer. Maybe tomorrow.
  • I didn't read most of the above. It seems like if someone had told me in the usual places what the issue is, it would have been resolved early and easily. Though I suspect the point of this discussion is not to prevent a handful of accidental edits, but to badmouth Wikidata. This makes me sad. --Magnus Manske (talk) 17:27, 18 April 2020 (UTC)
    As the person who opened this discussion I said nothing bad about Wikidata. In fact I said I wanted to find a way for this bot to continue operating. And I'm not sure how posting to both the bot's user talk and your user talk is not a "usual place". I get that it's not something you check regularly, but with respect that is a problem for us on enwiki. I get why it's a choice you've made and I'm not saying you're wrong to act that way, only that it is a problem for us as a wiki. Best, Barkeep49 (talk) 17:32, 18 April 2020 (UTC)
@Magnus Manske: a lot of the above reflects frustration caused by a block that threw the baby out with the bathwater, and people who want to ignore policy because the bot does a lot of good. That said, I see nothing above that claims Wikidata is evil, so just ignore the general orneriness and focus on the actual issues. Once the bot's logic gets updated, ask for the unblock, and things should get back to normal. I will point out that the bot's talk page is the proper place to raise issues about malfunctioning bots on Wikipedia, so if you don't monitor that page regularly, I suggest enabling email notifications when someone leaves you a message on User talk:ListeriaBot. Also feel free to review the recently updated WP:BOTCOMM, which makes explicit what was implicit before. Headbomb {t · c · p · b} 17:39, 18 April 2020 (UTC)

Update: I added a safeguard that should prevent ListeriaBot from using images that are "local" (not from Commons). Of course, I can't test it here, because the bot is blocked, and it's hard to find an actual example (the ones listed as links above turn out to be not applicable). --Magnus Manske (talk) 22:45, 18 April 2020 (UTC)
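
For readers wondering what a "local versus Commons" check can look like, here is a minimal sketch. It is not ListeriaBot's actual code, which is not shown in this discussion. It assumes the bot has already fetched a MediaWiki API response for `action=query&prop=imageinfo` with `formatversion=2`, where each page entry carries an `imagerepository` field (`"local"` for a file hosted on the wiki itself, `"shared"` for one served from Commons, empty when the file does not exist). The function name `commons_only_titles` and the sample file titles are hypothetical.

```python
from typing import Any, Dict, List


def commons_only_titles(api_response: Dict[str, Any]) -> List[str]:
    """Return titles of files served from the shared repository (Commons).

    A local file that shadows a Commons file of the same name is reported
    as "local", so it is excluded here along with missing files.
    """
    pages = api_response.get("query", {}).get("pages", [])
    return [
        page["title"]
        for page in pages
        if page.get("imagerepository") == "shared"
    ]


# Hypothetical response data, shaped like a formatversion=2 imageinfo query.
sample_response = {
    "query": {
        "pages": [
            {"title": "File:Hypothetical free flag.svg", "imagerepository": "shared"},
            {"title": "File:Hypothetical non-free logo.png", "imagerepository": "local"},
            {"title": "File:Hypothetical missing file.png", "imagerepository": ""},
        ]
    }
}

print(commons_only_titles(sample_response))
```

Filtering on the `"shared"` value rather than excluding `"local"` also drops files that do not exist anywhere, which seems the safer default for a list-building bot.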

Thank you Magnus Manske for doing all of that. I have unblocked the bot, and if there are any small-scale tests you need to run feel free to do so. Primefac (talk) 22:58, 18 April 2020 (UTC)

Test page suppressing the Flag of Japan, as per requirements. --Magnus Manske (talk) 10:49, 19 April 2020 (UTC)

Thanks to everyone who helped get this resolved. Lugnuts Fire Walk with Me 13:09, 21 April 2020 (UTC)