Commons:Village pump
This page is used for discussions of the operations, technical issues, and policies of Wikimedia Commons. Recent sections with no replies for 7 days and sections tagged with {{Section resolved|1=--~~~~}} may be archived; for old discussions, see the archives; the latest archive is Commons:Village pump/Archive/2024/11. Please note:
Manual settings: when exceptions occur, please check the setting first.
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 1 day and sections whose most recent comment is older than 7 days.
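The archiving rule stated above can be sketched in a few lines of Python. This is only an illustration of the two conditions as worded on this page, not SpBot's actual code; the function name `should_archive` and its parameters are hypothetical, and the exact boundary handling is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RESOLVED_GRACE = timedelta(days=1)    # wait after {{Section resolved}} was added
INACTIVITY_LIMIT = timedelta(days=7)  # wait after the most recent comment


def should_archive(last_comment: datetime,
                   resolved_at: Optional[datetime],
                   now: datetime) -> bool:
    """Hypothetical helper: apply the two archiving conditions described above."""
    # Condition 1: the section was tagged as resolved at least 1 day ago.
    if resolved_at is not None and now - resolved_at >= RESOLVED_GRACE:
        return True
    # Condition 2: the most recent comment is older than 7 days.
    return now - last_comment > INACTIVITY_LIMIT


now = datetime(2024, 11, 4, tzinfo=timezone.utc)
stale = datetime(2024, 10, 27, tzinfo=timezone.utc)
print(should_archive(stale, None, now))  # True: last comment is 8 days old
```

A section that is neither tagged as resolved nor older than 7 days stays on the page under this sketch.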
September 23
Hosting HDR images as JPEG with gain map
The tools for creating and displaying High Dynamic Range (HDR) images are starting to mature. HDR displays can render much brighter highlights than before, which leads to a big qualitative improvement in an image. Software for HDR production, and web-browser support, are becoming widespread. (Note that this is distinct from the tone-mapped HDR images you may have seen for the past decade or so.)
This post is partly a response to User:Hym3242 and User:PantheraLeo1359531 in Commons:Village pump/Archive/2024/08#Can I upload bt2020nc/bt2020/smpte2084(PQ) HDR AVIF images to commons and use them in wikipedia articles?. I was wondering the same thing, so I uploaded a couple files to see how well Commons would support them. They are formatted as JPEG with a gain map. The promise of this format is that it is backward-compatible with systems that process and serve standard JPEG. The base image is a JPEG, usable on any device. HDR information is inserted in the file as metadata. In the worst case HDR metadata is lost, resulting in a standard image. In the best case HDR metadata is preserved, the end-user has an HDR-capable display and web browser, and the image looks great.
My test results are at Category:HDR gain-mapped images. Both images survived the process of uploading and rendering previews. HDR metadata was stripped from preview images, but preserved in the original uploads. If you have a newish HDR screen and a compliant web browser, the originals of this house and this church will appear brighter than usual. The effect on the house is subtle, limited to where sunlight hits white paint. The effect on the church is more dramatic: the windows should appear much brighter than the rest of the interior.
Most users of Commons images will see one of the smaller standard files, so for now the benefits of publishing this sort of content are limited. Are there any downsides to publishing it on Commons?
This post isn't marked as a proposal, because hosting these images on Commons works already. At a later date, when the standards are settled and the hardware is widely available, it would be nice to preserve HDR metadata in the generated preview images. — Preceding unsigned comment added by Semiautonomous (talk • contribs) 23:51, 23 September 2024 (UTC)
- A phab task would need to be created for "include gain map of images into thumbs"- C.Suthorn (@Life_is@no-pony.farm - p7.ee/p) (talk) 07:41, 1 November 2024 (UTC)
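For anyone who wants to repeat the test in this thread, here is a rough Python sketch for checking whether a JPEG still carries gain-map metadata, for example to compare an original upload with a generated preview. It assumes the file announces its gain map through the Adobe `hdrgm` XMP namespace, as Ultra HDR style files do; the thread does not say which encoder produced the test images, so that is an assumption. The function names are made up for illustration, and a plain byte search is only a heuristic, not a real parser.

```python
# Assumption: the gain map is announced via the Adobe "hdrgm" XMP namespace.
HDRGM_NAMESPACE = b"http://ns.adobe.com/hdr-gain-map/1.0/"


def looks_gain_mapped(data: bytes) -> bool:
    """Heuristic: JPEG bytes that mention the hdrgm XMP namespace."""
    is_jpeg = data[:2] == b"\xff\xd8"  # JPEG files start with the SOI marker
    return is_jpeg and HDRGM_NAMESPACE in data


def file_looks_gain_mapped(path: str) -> bool:
    """Hypothetical convenience wrapper that reads a file from disk."""
    with open(path, "rb") as handle:
        return looks_gain_mapped(handle.read())
```

If the previews really have their HDR metadata stripped, as reported above, one would expect `True` for the original file and `False` for a downloaded thumbnail.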
October 14
Google's semi-censorship of Wikimedia Commons must end
Please see meta:Community Wishlist/Wishes/Do something about Google & DuckDuckGo search not indexing media files and categories on Commons. I think we can and should do something about Google not indexing most files (including all videos) and category pages on Commons. Prototyperspective (talk) 15:42, 14 October 2024 (UTC)
- It is a private company and if not violating the law, they can do whatever (...) they want. If they choose to ignore stuff on commons - that´s fine. Alexpl (talk) 20:02, 14 October 2024 (UTC)
- I was not saying it's illegal. That may be fine according to law. I wonder if it's fine to Commons that users' contributions are just blacked out and not available to people. Prototyperspective (talk) 21:39, 14 October 2024 (UTC)
- Huge file sizes for photos are a cost factor when it comes to processing and are almost never worth it anyway. I don't blame them for not wanting photos with the megabytes in the three digits to show up, whenever somebody types in a generic search term. Alexpl (talk) 14:13, 15 October 2024 (UTC)
- This seems offtopic. 1. Most files on WMC are not many MBs large and this is not about some particular few large files. 2. It only shows gstatic thumbnails in Google Search, not the whole image, and it's the same for DDG and other search engines.
It's absurd to argue that Google's storage or processing would have notable issues that, out of the millions of indexed websites, make WMC one whose media is not findable.
You can of course defend anti-WMC practices – although I don't understand why Commons contributors could be supportive of that – but this point does not make sense, partly because this isn't about the <0.1% of WMC files that are large image files to begin with. Prototyperspective (talk) 14:33, 15 October 2024 (UTC)
- This is not the first time I have seen you try to dismiss comments with which you disagree as "off topic", when they are not. Please do not do that. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:46, 15 October 2024 (UTC)
- I said it seems offtopic and I did not dismiss the comment but addressed it comprehensively. When I say it seems offtopic that is for example because I may have misunderstood it and/or the user may want to clarify how it would be ontopic. I do wonder why you're so super sensitive about me using the word offtopic. The user did say something but did not explain how it relates to this subject and clarifying that with clear language is I think more constructive than beating around the bush. Prototyperspective (talk) 16:41, 15 October 2024 (UTC)
- There already is a thumbnail for every file here anyway so not even any need to create any anew. Prototyperspective (talk) 15:30, 15 October 2024 (UTC)
- See also meta:Talk:Community Wishlist/Wishes/Do something about Google & DuckDuckGo search not indexing media files and categories on Commons. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:41, 14 October 2024 (UTC)
- There is a commercial interest in steering the search results to commercial and social websites. These generate clicks, not the commons. I do have the impression that Google is much more interested in SDC of files than the Commons categories. Every effort should be made to fill in the P:P180. Google certainly uses the labels in Wikidata as datafeed for the search engines. Also used for educating the translation software. Smiley.toerist (talk) 10:12, 15 October 2024 (UTC)
- Wikipedia itself is indexed rather highly on Google search results though. And it does index images that are used in Wikipedia articles, but this treatment isn't extended to the other Wikimedia projects. (I can't speak for other media files however). ReneeWrites (talk) 18:26, 15 October 2024 (UTC)
- Yes Wikipedia is, but not Commons, the second largest Wikimedia project with a type of content that lots of people are interested in, watch and search for (media of all kinds). It does not index any video on here (at least in my tests I could not find any so far even when searching for the exact title) and images I think are only indexed when they're used in Wikipedia articles and even then often missing from the main results. One part of the proposal is systematic tests/investigations so there is some data on this. I think overall the indexing is pretty bad even when one is searching for a subject that WMC has lots of high quality contents and other image results that are shown are fairly low-quality. One could also focus on the videos. Prototyperspective (talk) 20:32, 15 October 2024 (UTC)
- Google often indexes images that are not in a Wikipedia article. I find plenty if I do specifically an image search. But it doesn't tend to list pages that are mainly an image in its general results, so Commons image pages often don't show in the result if you do a general Google search. - Jmabel ! talk 05:11, 16 October 2024 (UTC)
- Rarely it does, but indexing a random tiny subset of files doesn't change anything about the issue and only makes it harder to notice this. I did not find plenty of images for prior searches I did where I then either used an image not from WMC, despite knowing that WMC has at least as good, well-organized images, or used the WMC search. Again, investigations are the first step of what is proposed so maybe you could share your searches. Images certainly shouldn't show up in the general search results (well nearly always) – I made it clear that this is about the Images and Videos tabs of these sites... only when it comes to category pages is this about the general search results. I currently don't have many good examples. Things I searched for (those may not be the best examples) I think included roughly "Rivers from space" and "Algae blooms from space" and "Satellite picture of cities at night". This is not about Google & DDG not indexing any files on WMC. Please let me know if that should be clearer in the proposal. It is about them indexing only very few images (and those are not even the most relevant or best) when it should be many (e.g. in searches where WMC has lots of well-organized files), not showing nearly all categories in the results and not indexing any videos. Maybe it should be clearer that this isn't necessarily all Google's fault – the investigations may reveal things Wikimedia community & tech could do to improve its inclusion in external search results – however such steps depend on investigations and don't mean step 2 & 3 are invalid, other things could follow up on that step in addition and shape these two. Prototyperspective (talk) 11:30, 16 October 2024 (UTC)
- @Prototyperspective: Colourpicture Publishers. There aren't that many results to begin with, but maybe it's at the top because the category has a description that contains the company's name in it? --Adamant1 (talk) 01:21, 18 October 2024 (UTC)
- Yes, that's the kind of investigations I'm proposing are done large scale and in systematic ways (and well visibly e.g. published in diff) so we can identify cases that are well indexed, find out why, and identify cases that should be well-indexed but aren't and so on.
- It could be that it's at the top because it contains a long descriptive category description – which most cats however don't really need because the category title is self-explanatory – as well as an infobox with all sorts of data. It's not unlikely also because there's few other websites with info on that subject, especially not recent ones that are linked from other pages. As a result of findings like your example, one could for example conduct tests (and/or check the theory via the dataset) whether it's the company's name in the description that caused the cat to show up this high or the description and consider things like adding category-descriptions (partly automatically via WP article leads and/or Wikidata item description). An open letter doesn't have to be as provocative and confrontational as the title of this thread, one could nicely ask Google & Co to improve their results by considering specific things or identified requested changes. Relevant to that is that Google & Co heavily make use of Wikimedia content in all sorts of ways but this isn't about fairly giving back (some media attention however could be due to that and reference that): it would be about them improving their search results for everyone so it shows media or pages that the person searching would likely find useful (e.g. via considering how many files and how many Wikipedia-used files are contained in the category). (When it comes to videos however it seems like purposeful exclusion.) Prototyperspective (talk) 08:24, 18 October 2024 (UTC)
- Google clearly does take these images into account. I looked up a handful of terms:
[Table: Google Images searches]
If you narrow your search to CC images, you get more from Flickr and Commons:
[Table: Google Images searches, narrowed to Creative Commons]
I don't believe there even is a problem. Sure, results from WMF projects are only 1 or 2 in many cases, but:
- it's not like there was any other site that did have a majority of the top results
- you can improve them by searching for CC content
- Wikipedia was almost always in the results, even if they didn't have a majority in the top images (which there's no reason it should, might I add). I can't say the same about other results I saw, like Britannica, NatGeo, Adobe Stock, etc.
- Google is showing results from Wikipedia, Commons, and even smaller projects like Wikispecies and Wikivoyage, at times. I wouldn't put it past them that they're prioritizing commercial and social sites that run Google Ads (pure speculation on my part, don't take my word for it), but I find it hard to believe that they're straight up censoring, shadowbanning, or otherwise limiting results from WMF projects. Rubýñ (Scold) 17:21, 15 October 2024 (UTC)
- I haven't repeated all the searches to test this, but with the ones I did I only got 1 result from WMF, and it was the image in the infobox of the Wikipedia article about the subject. ReneeWrites (talk) 20:29, 15 October 2024 (UTC)
- I personally use Ecosia to search things and I often just type in something in Ecosia rather than search it here because I am too lazy to use the convoluted Wikimedia internal search method (yes, using external websites to find something is oftentimes easier than the internal "search" engines on Wikimedia websites). But I noticed that in the past few months Ecosia has been suppressing non-Wikipedia Wikimedia websites more. This seems to coincide with the switch where Ecosia now mixes in Google Search results with those from Microsoft Bing. Before this change Ecosia exclusively used Microsoft Bing, and while I've used Microsoft Bing as my main search engine since 2011~2012'ish, I switched to Ecosia a couple of years ago (after I saw one of their advertisements on Google YouTube) and I occasionally compare it with Google Search and other search engines. Judging by the fact that Google Search suppresses Wikimedia Commons and Microsoft Bing does this to a lesser extent I assume that this likely is a deliberate choice by those companies. But it could probably also be something internal at Wikimedia websites as all non-article space pages at Wikipedia are also excluded from search engines (meaning that someone cannot find any Wikipedia policy pages unless someone looks for them within Wikipedia, which I've always found to be a rather odd choice).
- Now, we know that Google Search, Microsoft Bing, Ecosia, DuckDuckGo, Yahoo! Search, Etc. all heavily rely on Wikidata, perhaps linking all Wikimedia Commons category pages with Wikidata items might help integrate this website better with search engines. If you think about it, the exclusion of the Wikimedia Commons is exclusively the exclusion of the Wikimedia Commons, I have no trouble finding results from the Wiktionary or Wikivoyage, which probably means that the integration between Wikidata and other Wikimedia websites helps them. Now, I know that "SEO" is considered "a curse word among Wikimedians", but if we want the Wikimedia Commons to show up in search results we most likely do need to link to Wikidata and properly use redirects, alternative titles, translations, Etc. in a way that makes sense. For example, if you search for alternative titles on Wikipedia you get them: type "Communist Germany" into a search engine and you'll find the DDR because "Communist Germany" is a redirect at Wikipedia. Meanwhile, we tend to have highly specific titles and redirects are typically deleted. But my guess is that the main culprit is the lack of Wikidata integration at the Wikimedia Commons, I wonder if files with more optimised structured data also show up in search engine results more as these are dependent on Wikidata items. Alternatively, we could compare if categories with or without Wikidata integration show up more in internet search engines. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 18:52, 19 October 2024 (UTC)
- Thanks for this interesting info contribution.
- Comparing indexing results between search engines like so and across time (especially after algorithms were reported to be changed albeit it's often probably not announced) could help identify causes and potential mitigation measures.
- I never noticed or thought about search engines not indexing policy and meta pages of Wikimedia sites (nonWMC), if so that's also I think something that would be good to be changed if possible. For example, new editors or readers may search for these with a search engine instead of the internal one. If they searched for a meta/help pages on Commons it's often quite possible they can't find it because they don't show up in the search results even when in the MediaSearch' Categories and Pages tab (issue #8 here).
- "[Google & Co] all heavily rely on Wikidata": that good integration with Wikidata is a cause for SE indexing or good indexing, and that improving that integration would improve it, are two hypotheses that could be tested. I do not think this is the case much because category pages that are linked to Wikidata items also do not show up and only a tiny subset (< 0.01%) of files are used in Wikidata items or usable there while most items are somewhere underneath a category that is linked to a Wikidata item. I think 'it's not linked to a Wikidata item' or 'it doesn't have structured data depicts statements' would be not much more than false excuses (not necessarily deliberate) for not indexing and I don't see why it would rely on / require it / why it should be expected. Moreover, some categories should probably be well-indexed without being linked to a Wikidata item or linking such would be inappropriate or at least can't be done at scale(?) – e.g. Category:Drone videos with lots of organized content can't even be found in DuckDuckGo when searching for "drone videos wiki" (btw I think it should also show up high for searches like "free drone videos"). The linked proposal however is interesting but I have doubts this can be done both at scale and affects the SE much. Data suggesting that this has any significant effect is also missing. So I don't think it would solve this, e.g. videos on WMC still don't show up in the videos tab and many large categories are already linked.
- "and properly use redirects, alternative titles, translations, Etc. in a way that makes sense": Agree. One option is to sync ENWP redirects of items to WMC so WMC has the same redirects [ie a tool for doing so]. Another is Adding machine translated category titles and this could also be implemented via redirects and be extended to category descriptions. This however is another case that I don't think should be required for the pages to show up in search results but only improve them. It's possible that this would solve this even if it shouldn't be that way due to how pages are ranked. Note that this may require that the category page is an actual URL with an actual title and not the same URL with some JavaScript dynamically changing the title depending on the user language. Another option of creating redirects of translated titles – Category:Tiere (de; only plural form not singular) currently redirects to Category:Animals – can't be done at scale and may cause issues (such as HotCat autocompletes).
- In any case such comparison data would be great even if it's just a small factor (I doubt it's the main culprit for the plural indexing issues).
- Prototyperspective (talk) 20:03, 19 October 2024 (UTC)
- From everything I've been able to tell, Google does index pages in "Commons" space. For example, do a Google search on "structured data commons" (no quotes). - Jmabel ! talk 16:43, 20 October 2024 (UTC)
- Yes, this is known, e.g. the intro already is about "most" files, not "all" files as well as results' ranking/findability. I've yet got to see a WMC video in the videos tab however. Prototyperspective (talk) 16:46, 20 October 2024 (UTC)
- Sorry I misunderstood your comment Jmabel – it's addressing point #2 and you're right on that.
- Some examples of low-views useful major categories below. Please comment if anybody knows more in regards to why Videos on WMC are not showing in the Videos tab of Google, DuckDuckGo, etc. Maybe one could ask them or see if there's any other large websites whose videos are not shown there (and why).
- Prototyperspective (talk) 17:23, 26 October 2024 (UTC)
- The 14th most viewed page and the second most viewed category on Commons [1] is also a video category [2]. Views on all Commons pages are quite low; there is nothing special with videos on Commons. GPSLeo (talk) 19:13, 26 October 2024 (UTC)
- Yes, even Commons pages with the most views get few views, which is consistent with the problem description in the proposal. I did not suggest there was something special with videos except that none of them are shown in and indexed in the videos tab of the search engines. Prototyperspective (talk) 19:29, 26 October 2024 (UTC)
- It's a good thing, if Google keeps us a relative secret. This is a databank for a select audience, that’s hopefully using items for creating content, or research. It's not a social media website for easy access to every airhead in creation, we don't need the level of vandalism, that would surely follow.
- As a matter of fact, we scavenge off commercial websites, without them, we would have limited access to new material. It would be detrimental, to attempt to replace them, no good would come of it. Broichmore (talk) 12:26, 29 October 2024 (UTC)
- Even for "select audience" it's known, used and discoverable far too little. They also use the Videos tab for example. Moreover, I do not agree with this elitism. Free media and free knowledge is about society overall not some very small group. With increased use, there would also be increased contributors who watch pages and Wikipedia is used much more and is not overrun by vandalism, it probably doesn't increase linearly with increased public use and even if it would there can be and are technological means to detect vandalism. The site would not replace commercial websites even if far more popular. I do not agree that we scavenge off these either. Prototyperspective (talk) 12:54, 29 October 2024 (UTC)
- So, to wrap this up: you want to upload stuff on Commons and have it shown in google´s services in a predictable way. This would only make sense for either advertising or some sort of campaigning and that is "no bueno". Alexpl (talk) 15:43, 30 October 2024 (UTC)
- No this doesn't wrap it up at all and it's entirely unrelated to advertising or some sort of ad-like campaigning. It's also not about a "predictable way". Prototyperspective (talk) 16:03, 30 October 2024 (UTC)
- Sure. Alexpl (talk) 18:30, 31 October 2024 (UTC)
- It's too bad the Phabricator ticket is stalled out. It doesn't seem like anything else can be done about it outside of that though. --Adamant1 (talk) 19:15, 31 October 2024 (UTC)
- I named three specific things in the linked proposal. These things can be done. Prototyperspective (talk) 21:11, 31 October 2024 (UTC)
- Sure, but I was specifically referring to this discussion. Not suggestions you've made in other proposals. Can anything be done about it in this conversation? Probably not. Can things be done about it in other conversations or places? Maybe. But I'm not replying to someone else in another conversation now am I? --Adamant1 (talk) 21:34, 31 October 2024 (UTC)
- I don't think it's appropriate (let alone necessary) to make assumptions about why someone would support this initiative, especially if those assumptions are going to be bad ones. For my part I just like the information I add to these projects (whether this is Commons or Wikipedia itself) to be findable, but the difference between how the Google search engine treats these two projects is night and day. ReneeWrites (talk) 15:57, 3 November 2024 (UTC)
October 27
I messed up making a mass deletion request
How do I fix it? I edited the template page instead of making a new request by accident. — Preceding unsigned comment added by TansoShoshen (talk • contribs) 21:05, 27 October 2024 (UTC)
- @TansoShoshen: I've deleted the template page; you can go ahead and re-make your request. Normally you would have gotten an error, since the page is create-protected. However, admin FunkMonk had recently made the same error, so the page happened to exist and you could edit it. Pi.1415926535 (talk) 22:21, 27 October 2024 (UTC)
- @TansoShoshen consider using com:vfc for mass requests. RoyZuo (talk) 18:57, 1 November 2024 (UTC)
October 28
Flickr license and license in embedded metadata differ
The given image is currently licensed CC‑BY‑2.0 (generic). But the image metadata clearly states CC‑BY‑4.0. Should the licensing here be changed? I prefer the information in the metadata myself. Also, the version 2.0 licenses are at least a decade stale and legally deficient in several respects.
In addition, I can easily contact the copyright holder and gain explicit permission for CC‑BY‑4.0 should that be necessary.
Thanks in advance. RobbieIanMorrison (talk) 12:21, 28 October 2024 (UTC)
- Sorry, I did not realize the item for "copyright" at the top of this page is clickable and not an indicator. But I'll leave this posting here nonetheless. (It would be more intuitive to have little subtabs at the top and not just colored text, a hint!) RobbieIanMorrison (talk) 12:27, 28 October 2024 (UTC)
- Afaik, what matters is what is written in the license template. The problem is that data in the metadata might become obsolete due to changes, or the metadata is automated for all works by a photographer. Sometimes we have an upload to Flickr with an NC license, but the author decides to change to CC BY. Then the metadata does not reflect recent changes, but in fact, there are some. (Another example: When a photographer uploads his image to Commons, but has an NC license stated in the metadata, it becomes obsolete when he declares to publish his work under a CC BY license, for example). There are also many cases where the metadata states that the respective image must not be used without permission by the photographer, but since then, usage rights were transferred to another institution and they released the image under a free license, but the metadata does not reflect these recent changes --PantheraLeo1359531 😺 (talk) 08:00, 29 October 2024 (UTC)
- In this case, it is interesting whether a usage under the conditions of version 2 AND 4 is allowed, as the license only vary in the versions, not the restrictions necessarily --PantheraLeo1359531 😺 (talk) 08:04, 29 October 2024 (UTC)
- @PantheraLeo1359531: No easy answer, I guess, in terms of workflows. The tags embedded in the file can easily become obsolete. But I would strongly argue that the most liberal license present should still take precedence. And I would suggest that the CC‑BY‑4.0 license is the most liberal with its grant of 96/9/EC database rights. So returning to my original question, I believe the license notice on Wikimedia should be modified to version 4.0. I am going to get technical here, so feel free to stop reading! The SPDX AND logical conjunction operator requires that recipients simultaneously comply with the terms of both or all listed licenses. This is correct, AFAIK, in your example because CC‑BY‑4.0 is simply more permissive than CC‑BY‑2.0. In short, CC‑BY‑2.0 is forward/inbound compatible to CC‑BY‑4.0 (my best info using a quick search was this). Noting also that the CC‑BY‑2.0 does not contain the "or later" version language that some software licenses do. Thanks for your reply. RobbieIanMorrison (talk) 09:16, 29 October 2024 (UTC)
- Thank you for your answer! I am not an expert on the license details, so I cannot examine further what to do :). Greetings --PantheraLeo1359531 😺 (talk) 09:31, 29 October 2024 (UTC)
- I spend quite a lot of time advocating for en:open data. RobbieIanMorrison (talk) 10:13, 29 October 2024 (UTC)
- If they offer two versions of the same named license, any reuser can select whichever they prefer. Just like any other multi-licensing. - Jmabel ! talk 03:40, 30 October 2024 (UTC)
- @Jmabel: Thanks. Essentially the SPDX OR logical disjunction operator, if one wanted to be explicit. That was not my question. My question was: should that image stored on Wikimedia be tagged as CC‑BY‑4.0 and not CC‑BY‑2.0, version 4.0 being the more favorable license for several reasons (universal, database rights grant, contemporary)? RobbieIanMorrison (talk) 11:46, 30 October 2024 (UTC)
- @RobbieIanMorrison: No, it should be tagged as both. Generally, this is done as a vertical stack. — 🇺🇦Jeff G. ツ please ping or talk to me🇺🇦 14:58, 30 October 2024 (UTC)
- @Jmabel: That is a sensible approach. Some only obliquely relevant comments follow. Why does Flickr apply CC‑BY‑2.0 on images shot in 2022? I could not find a definitive source for CC‑BY‑2.0 being forward compatible to CC‑BY‑4.0. And I spoke to the photographer of the image under discussion recently and he said he would reissue any of his material for Wikipedia under CC‑BY‑4.0 on request (we often end up photographing the same climate protests in Berlin). Thanks for your replies too. RobbieIanMorrison (talk) 15:26, 30 October 2024 (UTC)
- @RobbieIanMorrison: Flickr never updated this aspect of their offered licensing. I have no solid idea why they have made that choice; most likely they defaulted into lack of change by not addressing the issue. But you'd really have to ask someone at Flickr why Flickr made a particular decision; I certainly can't speak for them. - Jmabel ! talk 17:42, 30 October 2024 (UTC)
- This issue was already addressed on Flickr. It seems that they just don't care. [3] [4] Herbert Ortner (talk) 20:37, 30 October 2024 (UTC)
- @Herbert Ortner: Thanks. Some discussion about file‑specific embedded licenses versus site licenses in that last URL. RobbieIanMorrison (talk) 21:56, 30 October 2024 (UTC)
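For readers unfamiliar with the SPDX operators mentioned in this thread, here is a minimal sketch of how a flat license expression can be read. It assumes expressions without parentheses, `+` or `WITH`; the function name `reuser_options` is hypothetical and not part of any SPDX tooling. In SPDX, `AND` binds more tightly than `OR`, so the expression is split on `OR` first.

```python
def reuser_options(expression: str) -> list[set[str]]:
    """Return the alternative ways a reuser can comply with a flat
    SPDX-style expression (no parentheses, no '+' or 'WITH').

    Each set in the result is one option: OR separates alternatives,
    AND joins licenses whose terms must all be met together.
    """
    options = []
    for alternative in expression.split(" OR "):
        options.append({part.strip() for part in alternative.split(" AND ")})
    return options


# Dual tagging as suggested above: the reuser picks either version.
print(reuser_options("CC-BY-2.0 OR CC-BY-4.0"))
# Conjunction: the reuser must satisfy both licenses at once.
print(reuser_options("CC-BY-2.0 AND CC-BY-4.0"))
```

Tagging the file with both licenses as a vertical stack corresponds to the `OR` reading, where each reuser selects whichever version they prefer.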
October 29
Your input...
FYI: Commons talk:Administrators#Userpages: red or blue? Regards, Aafi (talk) 09:39, 29 October 2024 (UTC)
- "[Administrator should have a] user-page with .. [information] how they could be contacted": Is that a joke? Don't we have talk pages for that?
∞∞ Enhancing999 (talk) 10:14, 29 October 2024 (UTC)
- @Enhancing999, I'm sorry if that sounded somewhat weird. I've made a change and tried to clarify what I exactly mean by it. You're free to comment on that discussion. I posted here for a wider community input and won't be monitoring any responses here. ─ Aafī on Mobile (talk) 10:23, 29 October 2024 (UTC)
- So my quote is no longer on that page. Ok.
- I wonder if user pages are read as much as some users hope ..
∞∞ Enhancing999 (talk) 10:27, 29 October 2024 (UTC)
- Interestingly, there isn't much info on User:EugeneZelenko's user page (one of the admins/bureaucrats who asked for a user page to be created).
∞∞ Enhancing999 (talk) 10:43, 29 October 2024 (UTC)
- Aren't language skills, user rights status and projects where user is participating/had participated completely useless? This seems bare minimum for me and I don't demand for something more. EugeneZelenko (talk) 14:34, 29 October 2024 (UTC)
- For user rights, the information is generally not complete and better left to the relevant MediaWiki function.
- Language skills should be visible on the talk page and most of the time, at least implicitly it is.
∞∞ Enhancing999 (talk) 19:34, 29 October 2024 (UTC)
- Information from user talk page could be accidentally removed. For example links to archived talks were deleted couple of times from my talk page. Also archive bots could move it. So user page is something more persistent. EugeneZelenko (talk) 14:31, 30 October 2024 (UTC)
- Same could happen to a user page. Adding it directly to the talk page saves time.
∞∞ Enhancing999 (talk) 07:50, 1 November 2024 (UTC)
- In contrast to user talk pages, I've never heard of a user page getting archived, and other people don't normally come along and edit your user page. So I think the distinction is valid here. - Jmabel ! talk 06:09, 3 November 2024 (UTC)
- Admins are expected to know how to configure archiving so that not everything gets archived. Never heard of problems with the archiving bot in this regard.
∞∞ Enhancing999 (talk) 09:48, 3 November 2024 (UTC)
- "Admins are expected to know how to configure archiving…" Really? I'm an admin and have rather little idea how to set up archiving. I've had a couple of occasions to do it in 20 or so years, and looked up what I needed to know. Is this listed in the requirements for adminship somewhere? If so, I suppose I should learn it, but I'd be surprised if it is. - Jmabel ! talk 17:31, 3 November 2024 (UTC)
- It seems you figured it out. Most config things need checking each time one uses them, as those things tend to evolve.
- We wouldn't want to require users to create user pages and add languages to user pages instead of where they are needed: talk pages .. just because some admins wouldn't be able to manage their user talk page, would we?
∞∞ Enhancing999 (talk) 21:53, 3 November 2024 (UTC)
Thanks again to the person who posted the link to https://ocr.wmcloud.org/ for me. I am rerunning news articles where Newspaper.com could not transcribe their own articles or could not properly distinguish the columns of material and jumbled the transcribed text. The Google OCR was able to transcribe the previously unreadable articles and even transcribed handwritten cursive writing. Thanks again. RAN (talk) 21:35, 29 October 2024 (UTC)
- Any suggestions on what to do?--Trade (talk) 21:28, 30 October 2024 (UTC)
- @Trade: What language is that? Setting it to Korean, it transcribes something, although it doesn't look quite right. I'd think with such a tiny amount of text it'd be easier to just type it, rather than using OCR at all! :) Sam Wilson 00:43, 31 October 2024 (UTC)
- That would require me to know Korean in the first place Trade (talk) 00:46, 31 October 2024 (UTC)
- @Trade: According to Google Translate it's "Nano Cola" in Korean, which makes sense. --Adamant1 (talk) 03:12, 31 October 2024 (UTC)
- Selecting the lower half gives a result. The tool seems mostly helpful for long texts, but still, it works even on this.
∞∞ Enhancing999 (talk) 21:33, 31 October 2024 (UTC)
October 30
Views through mobile phones
Do we have a category for images like the one above? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:51, 30 October 2024 (UTC)
- technically? Category:Mobile phone screenshots. Alexpl (talk) 14:52, 30 October 2024 (UTC)
- @Alexpl: Please use the colon trick per internal links to form Category:Mobile phone screenshots. — 🇺🇦Jeff G. ツ please ping or talk to me🇺🇦 18:42, 30 October 2024 (UTC)
- There's no category for that currently, but you can find enough images that depict something similar to justify creating a new category for it. I'm not sure what to call it though. ReneeWrites (talk) 20:21, 30 October 2024 (UTC)
- No, there is no category for that. Checked this by searching for images like this and opening some that are of the same kind which all do not have such a category set. I doubt such a category would be due or useful though but it could be created. I think it would be useful if the phones showed augmented reality but there's very few photos of that kind and it would be a subcategory of Category:Augmented reality. Prototyperspective (talk) 10:41, 3 November 2024 (UTC)
If I make such a category, does anyone have any thoughts as to what we should call it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:19, 3 November 2024 (UTC)
- "Photos with views through mobile phones". Prototyperspective (talk) 12:28, 3 November 2024 (UTC)
There seems to be some inconsistency between the use of the term "music groups" and "musical groups". Anyone know which is correct?--Trade (talk) 21:05, 30 October 2024 (UTC)
- Both are perfectly valid English. - Jmabel ! talk 21:22, 30 October 2024 (UTC)
- Maybe, but arbitrarily having one set of categories use one spelling and another a different spelling makes for a complete mess. Trade (talk) 21:26, 30 October 2024 (UTC)
- Fine, but the choice is arbitrary. You asked which was "correct", and both are acceptable English. - Jmabel ! talk 01:01, 31 October 2024 (UTC)
- FWIW, "music groups" may be easier for non-native speakers, since we would refer to "rock music groups" and "jazz music groups", not "rock musical groups" and "jazz musical groups". - Jmabel ! talk 01:02, 31 October 2024 (UTC)
October 31
Almost 400k files need license review
I just did a search of Category:License review needed and subcategories and saw almost 400k files!!!
The result is that some of those files have been marked for review for years and the source dies before anyone reviews the file. Then we have two choices:
- Mark the file for deletion (just like what is standard for recent files that fail upload)
- Keep the file
I'm sure reviewers feel tempted to skip such old files because it does not feel right to delete a file that could have been saved if it had been reviewed right after it was uploaded.
The good news is that many of those files might actually not need a "normal" review to confirm the license. For example a bot can verify that a video has the right license but it can't check if there is any derivative work in the video. So it might help if we somehow could sort the files into those that urgently need a review and those that can wait. If anyone has ideas feel free to fix the problem.
If a file is checked 1 or 10 years after upload and is no longer available we could create a template like {{Grandfathered old file}} that says that the uploader claims the file is licensed freely but we can't verify that (now).
If we do so then we could move files that can't be reviewed out of the normal review categories and hopefully it will be easier for reviewers to keep up with new uploads. It's like link rot. We can't fix what is already broken but we can focus on new files.
Question is if that is an acceptable solution? Or does someone have a better idea? --MGA73 (talk) 16:04, 31 October 2024 (UTC)
- Delete the files. Otherwise, we create a playground for underworked attorneys to hassle Wikimedia/Foundation for years - before we ultimately have to delete those files anyway. Alexpl (talk) 16:55, 31 October 2024 (UTC)
- There are 30k+ files from Finna.fi which could be reviewed by software if somebody would like to write a script which compares each image to the image in Finna and confirms that the licence is correct. I could even write a script for that if somebody wants to run it. (note: I participated in uploading the images). I suppose that there are other images uploaded from well-formed repositories with an API which could be reviewed automatically too. --Zache (talk) 17:20, 31 October 2024 (UTC)
- I don't see how (all) files can/should be deleted as long as there is no obvious violation of guidelines or laws (and probably a huge amount of files is good (and several files are in use etc. etc.)) --PantheraLeo1359531 😺 (talk) 17:36, 31 October 2024 (UTC)
- Where exactly are those "400k" files? There are e.g. ~110,000 files in subcats of CAT:URAA (which includes +600 artist categories whose works are potentially affected by URAA paranoia), or ~130,000 files in CAT:PD-Art (PD-old default) (which are in 95% of cases obvious PD-old-70 or similar). There are 'only' 70,000 files using the actual {{LicenseReview}} template, and from my experience it doesn't seem to be the case that those files are more likely to be copyright violations than any other file on Commons (pretty much the opposite is the case). ~TheImaCow (talk) 17:56, 31 October 2024 (UTC)
- @TheImaCow: I agree that many files do not require an actual review but there are other review templates than LicenseReview. For example YouTube, Flickr and GODL-India. That is why I said it might help if we sort the categories into files that should be reviewed where someone confirms that the file is on some website with some license and files that need some other review where we do not need to compare the file to some website. --MGA73 (talk) 18:07, 31 October 2024 (UTC)
- @Alexpl: underworked attorneys could have done that already if they wanted. Some of the files have been here for many years. If the files are uploaded by users with a good upload history I would not worry that much. If uploaded by someone with only one upload or with 10 uploads where 9 were deleted as copyvios I would worry much more. In any case if someone sends a takedown notice then I’m sure the file would be deleted even if it had a template saying the file was claimed to be free but sadly not reviewed in time. --MGA73 (talk) 05:59, 1 November 2024 (UTC)
- A bot could identify files, that have a source, that is archived in archive.org or archive.is or both and add this information to the talk page of the file. Files without an archive version could get priority for review. --C.Suthorn (@Life_is@no-pony.farm - p7.ee/p) (talk) 07:05, 1 November 2024 (UTC)
- That is simply most (or so I think) files uploaded with video2commons for example. I don't know why you suggest deletion. They definitely should not be deleted just because somehow a license review tag was added. Most files simply do not have such a tag but are likewise not license reviewed, there is no reason for deleting files that have this template set. Once again I strongly disagree Alexpl but also I don't understand why he would even comment something like that.
- For license review, please prioritize those files that are in use. Various tools like GLAMorgan can be used to see files that are in use that are in category Category:License review needed. This tag / category is useful for that but maybe it should be used more sparingly, e.g. only for uploads by new users or a subset of video2commons uploads and/or the reviewing could be automated.
- Prototyperspective (talk) 12:02, 1 November 2024 (UTC)
- Here's one further idea: a link archival bot for external links on Commons (anywhere but especially in the source field of {{Information}}). There have been many requests & proposals for this in the Community Wishlists and so on but they are usually focused on Wikipedia. It seems like on Wikipedia lots of this is being done. Not so much on Commons except for vid2commons which seems to request an IA-archival for every video/audio import. This recent Wishlist proposal has "All projects" specified so its scope includes Commons; probably more could and should be done: Automatic Archiving of Cited Web Pages in Web Archive. Prototyperspective (talk) 17:27, 1 November 2024 (UTC)
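The archive-check idea suggested above could start from something like the following. This is only a sketch, assuming the Internet Archive's public Wayback availability endpoint and its `archived_snapshots` / `closest` JSON layout; the helper names are hypothetical, and the network request itself (plus any editing of file talk pages) is deliberately left out.

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint of the Wayback Machine availability API.
WAYBACK_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(source_url: str) -> str:
    """Build the query URL that asks whether source_url has a snapshot."""
    return f"{WAYBACK_ENDPOINT}?{urlencode({'url': source_url})}"


def closest_snapshot(response_text: str) -> Optional[str]:
    """Return the snapshot URL from an availability response, or None
    if the source has no usable snapshot."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Example with a hand-written response body (not real data):
sample = '{"archived_snapshots": {"closest": {"available": true, "url": "http://web.archive.org/web/2024/https://example.org/photo"}}}'
print(availability_url("https://example.org/photo"))
print(closest_snapshot(sample))
print(closest_snapshot('{"archived_snapshots": {}}'))
```

Files for which `closest_snapshot` returns `None` would be the ones to prioritise for human review, since their source cannot be recovered once it goes offline.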
Thank you for all the ideas. It would be great if they could be implemented. :-)
I mentioned a template earlier and I made an example of how it might look:
This image was originally posted to a website and claimed to be licensed under a free license. An administrator or reviewer <user> tried on the <date> to confirm that the above/below mentioned license was valid. However the file was not available on the specified source so the copyright status could not be confirmed. Administrator/reviewer found no indications that the copyright claim can't be trusted. If you disagree you can start a deletion request and state your reasons.
I think such a template would be useful because it will make it possible to get the file away from the review category and at the same time it tells everyone that there is no reason to ask for a new review. --MGA73 (talk) 16:48, 1 November 2024 (UTC)
- Support such a template.
- we need a bot to go through files with a youtube source and test if the youtube source is ccby. when no, fail the review; when yes, mark it with a template that says something like "bot xx confirms that the given source youtubeURL is ccby" and auto categorises to a category "youtube files reviewed by bot". if a human reviews after the bot review, it gets categorised to "youtube files reviewed by bot and reviewer".
- we also need bots/some better automatic processes for all the iranian news photos.
- RoyZuo (talk) 18:50, 1 November 2024 (UTC)
- Re 2.: Agree. However, it's not so simple: often people upload videos they don't have rights for under CCBY or only mean the music is CCBY but not the video. Sometimes, a different license is specified in the file description but usually that's just CCBYSA or CCBY4.0 instead of CCBY3.0. Sometimes, a license may be specified in the description but not in the file metadata but I think this is an edge case that shouldn't be a problem. Lastly, some files were CCBY at the time of upload but had this changed later on or the video is down. In any case, I don't think most of these 400 k files are videos from youtube. Prototyperspective (talk) 19:08, 1 November 2024 (UTC)
- All the special cases can be handled in a DR started by the bot, or by the uploader replacing the failed review template with one that says "this youtube file fails bot review but is actually good so a human please review it".
- as long as a bot starts working and continues non stop, any new youtube uploads will be handled shortly after upload. then it's the uploader's responsibility to explain all those special cases (changed licence, taken down video...). if they cant do that in like 1 or 2 days after upload, the file deserves speedy deletion.
- https://commons.wikimedia.org/w/index.php?search=incategory:License_review_needed+youtube 17545 / 76125 = 23%. RoyZuo (talk) 19:31, 1 November 2024 (UTC)
- RoyZuo there are already too many DRs to handle. If a bot starts thousands then the system will crash. I agree that files that fail a review shortly after upload should be deleted. But I think that a "no source" is better than a DR. --MGA73 (talk) 17:41, 3 November 2024 (UTC)
- Simple: rate-limit the bot to create 10 DR per month for old files (uploaded before the bot starts working). RoyZuo (talk) 19:38, 3 November 2024 (UTC)
- @RoyZuo: 10 DR per month is not even a drop in the bucket, certainly not a reason to use a bot. - Jmabel ! talk 19:41, 5 November 2024 (UTC)
- I'm happy to design the templates, but i dont have the coding skills for the bot testing youtube url bit. RoyZuo (talk) 19:45, 3 November 2024 (UTC)
- I just noticed that it seems that the YouTubeReview template puts files in both Category:License review needed and Category:YouTube review needed. I think files should be in only one of the categories. --MGA73 (talk) 05:53, 4 November 2024 (UTC)
- Comment There was an attempt earlier at Commons:Bots/Requests/EatchaBot 3 / Category:Arranged license review project to make review easier. I think it did help but it has now stopped. Maybe there are some ideas or code that can be of use for future bots. I also like the idea Zache mentioned about having a bot to confirm that files from Finna match the source. It is probably not possible to make one bot that can solve all problems but it will help if one or more bots can do some tasks and reduce the number of files that humans have to work on. --MGA73 (talk) 19:40, 1 November 2024 (UTC)
- Comment there is certainly a real issue here, but I have no idea how it would best be addressed. In an awful lot of these cases, the original source is no longer available. - Jmabel ! talk 17:39, 3 November 2024 (UTC)
- There're 6k pd files https://commons.wikimedia.org/w/index.php?search=incategory:License_review_needed+PD . many of them are there probably because of User:ShakespeareFan00 https://commons.wikimedia.org/w/index.php?oldid=519632949 . RoyZuo (talk) 19:57, 3 November 2024 (UTC)
- Yes and that is a different type of review. Even if the source dies it will not be a problem. --MGA73 (talk) 20:08, 3 November 2024 (UTC)
- Weird, was that ever a publication known for featuring the names of the writers with a large portrait next to the articles?
∞∞ Enhancing999 (talk) 09:38, 4 November 2024 (UTC)
- lol and I would have preferred that the review template was removed and the other one was kept. It is more specific. --MGA73 (talk) 14:22, 4 November 2024 (UTC)
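Several suggestions in this thread amount to sorting the backlog so that urgent files come first: files that are in use, and files whose source has no archived copy. A minimal sketch of such an ordering follows. The `PendingFile` fields and the `review_order` function are hypothetical, and how the flags would be filled in (for example from usage data or an archive lookup) is not covered here.

```python
from dataclasses import dataclass


@dataclass
class PendingFile:
    title: str
    in_use: bool           # file is used on some wiki page
    source_archived: bool  # an archived copy of the source exists


def review_order(files: list[PendingFile]) -> list[PendingFile]:
    """Order files so the most urgent reviews come first:
    in-use files before unused ones, and within each group
    files without an archived source before archived ones."""
    # False sorts before True, so in_use is negated to put used files first.
    return sorted(files, key=lambda f: (not f.in_use, f.source_archived, f.title))


backlog = [
    PendingFile("A.jpg", in_use=False, source_archived=True),
    PendingFile("B.jpg", in_use=True, source_archived=True),
    PendingFile("C.jpg", in_use=True, source_archived=False),
    PendingFile("D.jpg", in_use=False, source_archived=False),
]
print([f.title for f in review_order(backlog)])
```

The tuple key keeps the policy in one place, so a different priority (say, new uploads first) would only change that one line.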
November 01
Obtuse bot created categories
Apparently User:Gzen92Bot has been mass creating thousands of categories that only contain a couple of images and basing the names of the categories on the file names. Category:"Papier dominoté. Damier alternant le motif du dé, face cinq, un carré plein, deux carrés avec deux fleurs stylisées différentes, un carré avec un motif " géométrique ", sur fond vert pâle - btv1b10576326x being one of thousands of examples. People can look through Category:Files from Gallica needing categories (images) to find a ton more. Creating 20 word categories based on purely descriptive file names seems sub-suboptimal at best though. More so given that it's being done en masse and through automated editing. I'm not really sure what to do about it though since I'm not an expert on bots. Nor am I even sure it's an issue to begin with. But it does seem like a needlessly obtuse way to do things. So does anyone else have an opinion about it or know what can be done to fix the issue, assuming it even is one? --Adamant1 (talk) 04:51, 1 November 2024 (UTC)
- @Adamant1: I fully agree. Creation of >7,000 uncategorized and possibly-nonsense categories is not appropriate. Doubly so given that this does not seem to be an approved task for the bot. I have blocked the bot until/unless the task is approved.
- @Gzen92: This is the third time your bot has been blocked for operating with an unapproved task. Per Commons:Bots#Permission to run a bot, it is not optional to seek approval for bot tasks. Pi.1415926535 (talk) 05:46, 1 November 2024 (UTC)
- @Adamant1: As a regular user with some background in research data management, I completely agree as well. Thanks for pursuing the matter. RobbieIanMorrison (talk) 06:53, 1 November 2024 (UTC)
- Gee .. what's the cleanup plan for these?
∞∞ Enhancing999 (talk) 07:48, 1 November 2024 (UTC)
- Please delete all the subcategories of Category:Files from Gallica needing categories (images). Prototyperspective (talk) 11:56, 1 November 2024 (UTC)
- Strong oppose towards such mass deletions. These categories appear to contain similar images, which can greatly aid the manual, proper categorisation on Commons - these categories may or may not be deleted once the images in them have been properly categorized. ~TheImaCow (talk) 16:24, 1 November 2024 (UTC)
- Most of them contain just 2 images. The files would be upmerged. Prototyperspective (talk) 17:20, 1 November 2024 (UTC)
- @Adamant1, Pi.1415926535, and Enhancing999: I continued uploading following Commons:Bots/Requests/Gzen92Bot-4, but I agree with the additional categories. I will make a new request (I will indicate the link here soon). This raises questions: there are millions of files to upload and it cannot be done manually, so from how many files should a category be created? How to name the categories (other than with the name of the file)? Following the decision I could easily empty the categories. Gzen92 (talk) 08:19, 1 November 2024 (UTC)
- If you are not able to categorize the photos properly when uploading such an amount of photos you should slow down the upload process and create them manually. GPSLeo (talk) 08:29, 1 November 2024 (UTC)
- Categorisation of images on Commons is not a requirement when uploading images & it shouldn't be - especially not for batch/GLAM uploads. A category such as "Images to check" is sufficient & often much better than automated categorisation. There are still thousands of content categories with random junk in them that was dumped there by automatic categorisation from ten years ago which needs to be cleaned up. A bunch of images, or also a bunch of 500,000 images waiting in a "to check/to categorize" category don't hurt anyone whatsoever, as opposed to poorly done automatic categorisation. ~TheImaCow (talk) 16:24, 1 November 2024 (UTC)
- I made the request. Gzen92 (talk) 17:26, 1 November 2024 (UTC)
- I'm not sure if it's practical in this case but the way I'd do it is to categorize the images by subject. For instance "maps from Gallica", "books from Gallica", Etc. Etc. Then people sub-categorize the images beyond that if they want to. But at least it doesn't lead to a bunch of random categories. --Adamant1 (talk) 18:42, 1 November 2024 (UTC)
- Comment I'm not a fan of mass creation of categories with very few files in them (generally I do not like categories with very few files and I prefer to have 20 photos of John Doe in one category rather than to have 10 categories of John Doe in 2020, John Doe in 2021 or John Doe wearing a yellow hat looking west). But now that they are created, I agree with TheImaCow that it might be better to keep them until better categories are created. --MGA73 (talk) 18:04, 1 November 2024 (UTC)
- At Commons:Bots/Requests/Gzen92Bot-6 there is now a discussion on whether the user should be trusted with more uploads without categorization or cleanup of the current mess.
∞∞ Enhancing999 (talk) 10:46, 3 November 2024 (UTC)
Commons Gazette 2024-11
Volunteer staff changes
In October 2024, 1 sysop was elected. Currently, there are 180 sysops.
- User:Bastique was elected sysop (35/7/0) on 2 October.
Other news
- Results of Picture of the Year 2023 are out.
Edited by RoyZuo.
Commons Gazette is a monthly newsletter of the latest important news about Wikimedia Commons, edited by volunteers. You can also help with editing!
--RoyZuo (talk) 19:15, 1 November 2024 (UTC)
Derivative works (FOP etc.)
- does commons want derivative works (dw) that are currently not compatible with com:l, especially photos taken in no-FOP countries?
- were there users that got blocked for uploading such dw?
--RoyZuo (talk) 19:24, 1 November 2024 (UTC)
- Yes, they are wanted because one day they will be in the public domain. We hide the images and add an undelete date. There should be a mechanism in place where you can hide an image yourself and add the undelete date. --RAN (talk) 01:17, 2 November 2024 (UTC)
- I don't know if it's necessarily in line with the guidelines, but I'm a big proponent of people uploading copyrighted works under the guise of documenting them and then deleting them with undeletion dates. At the end of the day this is as much about documenting who created certain works and when they will become PD as it is a place to host freely licensed media. That's at least how I see it. There's no harm in uploading something purely to have it deleted so it can be restored once the copyright expires though. --Adamant1 (talk) 02:35, 2 November 2024 (UTC)
- Afaik, this topic or a similar one was already discussed. Uploading and then deleting sounds a bit cumbersome to me, but it would be very good if you could upload the file and set a publish date (especially for files with copyrighted content that will soon enter the public domain) :). But I strongly support the idea. --PantheraLeo1359531 😺 (talk) 10:09, 2 November 2024 (UTC)
- Just create a deletion request with the undeletion date. That's an easy way to do that. Yann (talk) 09:05, 3 November 2024 (UTC)
- I think question one should read "FOP" instead of "no-FOP".
∞∞ Enhancing999 (talk) 10:39, 3 November 2024 (UTC)
November 02
I'd like a second opinion on the user's uploads. All the pictures seem to be AI-generated. When confronted on his talk page, he admitted to heavily editing one of the pictures. Since the subjects of his articles are lesser-known (but notable) persons, I cannot confirm that the pictures actually represent the persons he claims they represent. Given this situation, do these pictures comply with Commons' inclusion policy? Strainu (talk) 10:19, 2 November 2024 (UTC)
Help needed with a new userbox template
Hi everyone!
I hope to receive your help with the template Template:User ISNI. The outputs should be as follows: the userbox should show the ISNI code in a format like 0000 1111 2222 3333 (with spaces, because of how Google indexes ISNI codes), but the URL should be in this format: https://isni.org/isni/0000111122223333. So my idea was that a user can input the 4 groups of characters separately and the template logic would turn them into the desired output formats. I'm struggling to make it happen and would appreciate a helping hand, please. David Osipov (talk) 11:54, 2 November 2024 (UTC)
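- To make the request above concrete, here is a minimal sketch of the intended logic in Python (not wikitext). It assumes the four groups arrive as separate inputs and that an ISNI is 16 characters, the last of which may be the check character X. The function name `isni_outputs` is hypothetical and is not part of Template:User ISNI.

```python
import re

ISNI_BASE_URL = "https://isni.org/isni/"  # URL prefix quoted in the request above


def isni_outputs(g1: str, g2: str, g3: str, g4: str) -> tuple[str, str]:
    """Return (display text, URL) built from four 4-character ISNI groups."""
    # Normalise each group: trim whitespace, upper-case a possible "x" check character.
    groups = [g.strip().upper() for g in (g1, g2, g3, g4)]
    compact = "".join(groups)

    # Assumption: 15 digits followed by a digit or "X", in four groups of four.
    if any(len(g) != 4 for g in groups) or not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        raise ValueError("expected four groups of four characters forming an ISNI")

    display = " ".join(groups)      # spaced form shown on the userbox
    url = ISNI_BASE_URL + compact   # unspaced form used in the link
    return display, url


display, url = isni_outputs("0000", "1111", "2222", "3333")
print(display)
print(url)
```

In a template, the same idea would amount to joining the four positional parameters with spaces for the visible text and without spaces for the link; how the validation step translates to template logic is left open here.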
Provinces of China by month and year
Hello! I have created templates for the distribution of provinces of China by year and month - these are examples {{MonthinChinabyprovince}} and {{Chinaprovinceyear}}. Could you help with categorization in a short time frame and also check the templates? MasterRus21thCentury (talk) 15:07, 2 November 2024 (UTC)
- Is there any consensus for categorizing images by Chinese province by month and year? Trade (talk) 20:03, 2 November 2024 (UTC)
- We don't need new templates. Use Template:Category description/Year by province. RoyZuo (talk) 23:58, 2 November 2024 (UTC)
November 03
Edit summary on project chat
Do we have a guideline that one should state which section one is replying to? If not, should we have one? Commons:Talk page guidelines doesn't say much about it, but seems to concern itself more with user talk pages than with project chat (or noticeboards).
Personally, I find [5] problematic. The user does so regularly and insists on continuing doing so systematically.
∞∞ Enhancing999 (talk) 11:09, 3 November 2024 (UTC)
- The gadget should be changed so that it includes the section link of the closed discussion. This has already been requested on its talk page. I also think section links to closed discussions are useful. If subscribing to a thread, one gets notified about any reply (and one can also see the section via the diff linked at the revision history), which makes this somewhat redundant, but it would still be useful. Better than having a gadget for marking threads about issues as solved would be some native button to do so, like there is for DiscussionTools that is used on MediaWiki talk pages.
- It's meta:User:DannyS712/EasyResolve. Prototyperspective (talk) 11:19, 3 November 2024 (UTC)
- That a gadget could be changed is not really relevant to the question about what we currently require. Also, as the change has been requested for a long time, it's unlikely it will be changed. In the meantime, one should limit its use to user talk pages.
∞∞ Enhancing999 (talk) 11:27, 3 November 2024 (UTC)
- I think 1) the problem of not including section headers is not large enough to mean contributors should stop using it; 2) many contributors often also edit without any edit summary or section header, there currently is no policy about such things, and while it may be the case that they should be asked to include these more often, they usually are not asked to change that; 3) the benefits of this gadget outweigh the downsides. In addition, it is relevant to this discussion – I never said it was relevant to the question about what "we currently require". However, obviously it's also relevant to that. Prototyperspective (talk) 11:33, 3 November 2024 (UTC)
- Yes, it would be nice if the gadget did this, no it is not a big problem. - Jmabel ! talk 17:42, 3 November 2024 (UTC)
- Agreed with Jmabel. ReneeWrites (talk) 22:35, 3 November 2024 (UTC)
file description pages from IA Flickr stream
File description pages on these generally have extensive automated content, e.g. at this file there is:
- "Identifier, Title, Year, Authors, Subjects, Publisher, Contributing Library, Digitizing Sponsor, Text Appearing Before Image, Text Appearing After Image".
All without actually including the title of the image (included in the source, but vertically).
By default, this all gets added into the "description"-field of {{Information}}. I wonder if there wouldn't be a better place: a separate section and/or field.
∞∞ Enhancing999 (talk) 14:30, 3 November 2024 (UTC)
New page for establishing textured meshes on Commons
In 2018, Commons allowed STL files to be uploaded for the first time. To extend the range of file types that can be uploaded, a new page for textured meshes was created. Perhaps some of you are interested :)
Commons:Textured 3D --PantheraLeo1359531 😺 (talk) 18:16, 3 November 2024 (UTC)
November 04
FYI
For the next few weeks, I'm looking forward to nominating some kind Wikimedians from this project on m:Merchandise giveaways to appreciate their contributions. I nominated @Abzeronow yesterday and I am hopeful that his contributions are valued. You might want to take a look at the nomination. Regards, Aafi (talk) 09:49, 4 November 2024 (UTC)
- Curious how much that cost? Aren't donations to WMF to run the servers and pay for MediaWiki developments?
∞∞ Enhancing999 (talk) 09:53, 4 November 2024 (UTC)
- Perhaps check m:Wikimedia merchandise for this purpose. Regards, Aafi (talk) 10:07, 4 November 2024 (UTC)
- It doesn't say anything about the cost of not selling the merchandise and not spending the charity funds on fixing the misconfigured Commons upload function instead.
∞∞ Enhancing999 (talk) 10:16, 4 November 2024 (UTC)
- The stability of uploads (on the server side) has been improved significantly this year. (This allows more stable upload tools by users. I cannot comment on the Upload Wizard; I almost never use it.) C.Suthorn (@Life_is@no-pony.farm - p7.ee/p) (talk) 16:37, 5 November 2024 (UTC)
- Making such an announcement while having a request for Oversight rights running is a bit odd as it looks like you would try to buy votes. GPSLeo (talk) 16:23, 4 November 2024 (UTC)
November 05
New law in Costa Rica: "Public Domain of Information"
Last Friday, November 1, 2024, Costa Rica’s official newspaper, La Gaceta, published Law 10.554, the "Framework Law on Access to Public Information". Pages 24-37.
Article 18 of this law establishes the following:
"ARTICLE 18 - Public Domain of Information
All materials produced by a public official in the course of their duties shall be considered in the public domain, except for personal data and without prejudice to the limits established in the Political Constitution of the Republic of Costa Rica, in international regulations approved by the Legislative Assembly, and in laws, in accordance with the principle of legal reservation."
I kindly request that a Wikimedia Commons administrator consider including this in the copyright policy. ¡Pura vida! LuchoCR (talk) 00:01, 5 November 2024 (UTC)
- Any idea whether this is retroactive? - Jmabel ! talk 19:45, 5 November 2024 (UTC)
Moscow State University Herbarium
Hi, I see that the Moscow State University Herbarium has images of its plants under a free license on its website. It would be useful to 1. add all images already uploaded to the source category; 2. license review all files; 3. mass upload all files not yet uploaded. This may require writing a bot, and knowledge of botany (and maybe Russian, although the website is also available in English) is probably needed to properly categorize the images (Total items: 983,569). And more than that, apparently all images under [6] are under a free license. Yann (talk) 15:36, 5 November 2024 (UTC)