What is Hackers' Pub?

Hackers' Pub is a place for software engineers to share their knowledge and experience with each other. It's also an ActivityPub-enabled social network, so you can follow your favorite hackers in the fediverse and get their latest posts in your feed.


Please share - Pew Research Center is looking for a data archivist who will be an advocate for data users, helping to ensure that our datasets are easy to discover and reuse by researchers, journalists, and the public.
pewtrusts.wd5.myworkdayjobs.co

Data Archivist

Position Summary

Pew Research Center is seeking a Data Archivist to support our commitment to open science and data transparency. This newly created role will play a key part in enhancing the accessibility, usability, and reproducibility of our research data while continuing to protect the privacy and identity of our survey participants. As Data Archivist, you will lead efforts to create and implement best practices for preparing, documenting, and disseminating datasets. These best practices should maximize FAIR (Findable, Accessible, Interoperable, and Reusable) principles while minimizing disclosure risk. You will work across teams to ensure our data is well-organized and thoroughly documented. You will serve as an internal advocate for data users, helping to ensure that our datasets are not only accurate and comprehensive but also easy to discover and reuse by researchers, journalists, and the public. This is a full-time, Pew Research Center position. The position is funded by an external grant and limited to a two-year term.

Primary Responsibilities

* Develop optimization procedures to improve discoverability of our datasets on internal and external platforms
* Develop and maintain standards to improve accessibility of our microdata and tab plans by changing/adding formats and/or adding documentation
* Identify metadata documentation best practices and a process to implement those best practices at the Center
* Work with Legal to evaluate most appropriate license to publicly share the Center's survey data, including Creative Commons options
* Identify and correct processing inefficiencies in our data publication process
* Sit on the internal Disclosure Risk Taskforce
* Document analytical decisions and code to support transparency and replicability, including the development of a RACI chart for publishing code to recreate derived variables that are used in reports but are not included in the microdata
* Manage/create merged time series datasets for select Center datasets
* Identify a process for internally archiving data and projects that are no longer in active use
* Identify and implement a process to assign Digital Object Identifiers (DOI) to microdata
* Prepare and upload public-facing datasets and restricted-use datasets for external sharing.
* Train staff on FAIR principles and best practices in data archiving.

Education/Training/Experience

* Bachelor’s degree required, preferably in library sciences, organizational management, or a related field.
* 5-7 years of experience with data archiving, database management, or survey research. This may include graduate training at the MA/PhD level or equivalent experience in an applied setting.
* At least 3-5 years of experience applying FAIR and open science principles.
* Background in social science research or data curation.
* Experience in data management, archiving, or research support.
* Familiarity with FAIR principles, Creative Commons licensing, data privacy principles, and exposure risk.
* Proficiency in metadata standards and documentation tools.
* Experience managing research projects, including working collaboratively with a multidisciplinary team.
* Experience with statistical software (e.g., R, Python, Stata) and reproducible research workflows.

Knowledge, Skill and Workplace Requirements

* Strong organizational and communication skills.
* Detail oriented with exacting standards to maintain accuracy and impartiality in all work products.
* Ability to work independently to carry out special projects from start to finish.
* Ability to balance numerous tasks simultaneously.
* Ability to work collaboratively and collegially with other team members, as well as with staff from other Pew Research Center teams.
* Ability to balance competing priorities and identify optimal solutions

FLSA Status: Exempt

Compensation: Starting salary is commensurate with experience within the range of $100,000 - $120,000.

Hybrid Work Schedule: Pew Research Center staff are required to be present in the Center’s Washington, D.C., office three core days weekly (Tuesday, Wednesday, Thursday). Staff may work virtually from remote locations on other days in a typical work week.

Application Procedure: Click on the Apply button, and complete required fields. Both cover letter and resume are required. When requested, please upload a copy of your resume/cv, as well as a copy of your cover letter in the section labeled Resume/Cover Letter. If the documents have successfully uploaded, you should see 2 attached files beneath the “Drop files here” box. Please make sure you have uploaded a resume AND a cover letter before moving on to the next page.

Total Rewards

In addition to competitive pay, Pew Research Center’s employees enjoy a robust total rewards package that includes:

* Affordable, comprehensive health care that includes medical, dental (including adult orthodontia) and vision benefits.
* Generous paid annual leave plan, including a winter break between Dec. 25 and Jan. 1
* Employer-paid disability, life insurance and paid family leave plans
* Up to a 12% employer 401(k) contribution, with vesting at the end of the first year.
* A 37.5-hour workweek.
* Health savings or flexible spending account options with employer funding component.
* Flexibility to telework a portion of each week, with an additional four telework “flex weeks” each year for most staff.

EEO: Pew Research Center makes employment decisions without regard to age, sex, race, ethnicity, religion, disability, marital status, sexual orientation or gender identity, military/veteran status, or any other basis prohibited by applicable law. We encourage applications from candidates who represent a variety of backgrounds, perspectives, and skills.

Pew Research Center is a great place to work, learn and grow. Our culture is open, collegial, collaborative, supportive and down-to-earth. Our staff is made up of more than 180 smart, talented, mission-driven people who care deeply about the work they do. We hire people from a wide variety of backgrounds, including social science researchers, data scientists, survey methodologists, journalists, graphic artists, web developers, communications professionals, and administrative support and operations staff. In our work we value independence, objectivity, accuracy, rigor, humility, transparency, and innovation. An extension of these values is our vision of a positive, welcoming workplace built on respect, collaboration, openness, accountability, and community building – one where everyone can thrive and contribute to the mission.

pewtrusts.wd5.myworkdayjobs.com


RE: mastodon.social/@firefoxwebdev

A killswitch should of course kill all ML/AI functionality, and people could then reactivate specific features if they want to. It's really not that hard. Just because you consider a feature "better" than others does not override consent practices.


ok, I think I'm done with this project?

it now plays conway's game of life and creepy noises, all from a single STM32G431 MCU with minimal external components, which is really all you can want from some lowfi analogue TV stuff.

In the process of making the audio work I ended up changing from channel 2 to channel 3 to get rid of the interference from the 48MHz clock I use for audio, which also means I'm running the CPU faster and get a little bit more processing time. Number of external components is now 4 resistors for mixing the baseband video signal and 5.5MHz audio carrier. Unfortunately going to the higher frequency also meant losing some signal quality, since I'm operating the internal opamp even more out of spec. Or maybe I'm just unlucky and there is more interference on that frequency here.


I have been trying to figure out what to post here about ICE activity today in the Minneapolis / St. Paul metro. It is just •batshit• out there.

Some admin mucky-muck is in town today (Noem, I think?), and ICE is putting on a full We Are A Big Fascist Deal theater show. The result of this is chaos: ICE caravans all over, vulnerable people terrified, less-vulnerable residents dropping their work to keep watch on the streets.

1/


As Markdown has become the standard for LLM outputs, we are now forced to witness a common and unsightly mess where Markdown emphasis markers (**) remain unrendered and exposed, as seen in the image. This is a chronic issue with the CommonMark specification, one that I reported about ten years ago, but it has been left without any solution to this day.

The technical details of the problem are as follows: In an effort to limit parsing complexity during the standardization process, CommonMark introduced the concept of "delimiter runs." These runs are assigned properties of being "left-flanking" or "right-flanking" (or both, or neither) depending on their position. According to these rules, a bolded segment must start with a left-flanking delimiter run and end with a right-flanking one. The crucial point is that whether a run is left- or right-flanking is determined solely by the immediate surrounding characters, without any consideration of the broader context. For instance, a left-flanking delimiter must be in the form of **<ordinary character>, <whitespace>**<punctuation>, or <punctuation>**<punctuation>. (Here, "ordinary character" refers to any character that is not whitespace or punctuation.) The first case is presumably intended to allow markers embedded within a word, like **마크다운**은, while the latter cases are meant to provide limited support for markers placed before punctuation, such as in 이 **"마크다운"** 형식은. The rules for right-flanking are identical, just in the opposite direction.

However, when you try to parse a string like **마크다운(Markdown)**은 using these rules, it fails: the closing ** is preceded by punctuation (a parenthesis), so it must be followed by whitespace or another punctuation mark to be considered right-flanking. Since it is followed by an ordinary letter (은), it is not recognized as right-flanking and thus fails to close the emphasis.
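The flanking rules described above can be checked mechanically. What follows is a minimal Python sketch under stated assumptions, not CommonMark's reference implementation. The names `flanking` and `flanking_of` are made up for illustration, `str.isspace` stands in for the spec's notion of Unicode whitespace, and "punctuation" is assumed to mean the Unicode general categories P and S (as I understand recent versions of the spec; older versions used a narrower definition). Backslash escapes and code spans are ignored.

```python
import re
import unicodedata


def is_whitespace(ch: str) -> bool:
    # The empty string stands for the start or end of the text,
    # which the rules treat like whitespace.
    return ch == "" or ch.isspace()


def is_punctuation(ch: str) -> bool:
    # Assumption: "punctuation" means Unicode general categories P* and S*.
    return ch != "" and unicodedata.category(ch)[0] in ("P", "S")


def flanking(before: str, after: str) -> tuple[bool, bool]:
    """Return (left_flanking, right_flanking) for a delimiter run,
    given the single characters immediately before and after it."""
    left = not is_whitespace(after) and (
        not is_punctuation(after)
        or is_whitespace(before)
        or is_punctuation(before)
    )
    right = not is_whitespace(before) and (
        not is_punctuation(before)
        or is_whitespace(after)
        or is_punctuation(after)
    )
    return left, right


def flanking_of(text: str) -> list[tuple[bool, bool]]:
    """Classify every run of '*' in text, in order of appearance."""
    results = []
    for m in re.finditer(r"\*+", text):
        before = text[m.start() - 1] if m.start() > 0 else ""
        after = text[m.end()] if m.end() < len(text) else ""
        results.append(flanking(before, after))
    return results


for sample in ["**마크다운**은", "**마크다운(Markdown)**은"]:
    print(sample, flanking_of(sample))
```

In this sketch, the closing run in **마크다운**은 comes out as both left- and right-flanking, so it can close the emphasis. The closing run in **마크다운(Markdown)**은 comes out as left-flanking only, which matches the failure described above.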

As explained in the CommonMark spec, the original intent of this rule was to support nested emphasis, like **this **way** of nesting**. Since users typically don't insert spaces inside emphasis markers (e.g., **word **), the spec attempts to resolve ambiguity by declaring that markers adjacent to whitespace can only function in a specific direction. However, in CJK (Chinese, Japanese, Korean) environments, either spaces are completely absent or (as in Korean) punctuation is commonly used within a word. Consequently, there are clear limits to inferring whether a delimiter run is left- or right-flanking based on these rules. Even if we were to allow <ordinary character>**<punctuation> to be interpreted as left-flanking to accommodate cases like **마크다운(Markdown)**은, how would we handle something like このような**[状況](...)**は?

In my view, the utility of nested emphasis is marginal at best, while the frustration it causes in CJK environments is significant. Furthermore, because LLMs generate Markdown the way people would actually write it, rather than strictly following the design intent of CommonMark, this latent inconvenience that users have long felt is now being brought directly to the surface.

* 21. Ba5# - 백이 룩과 퀸을 희생한 후, 퀸 대신 **비숍(Ba5)**이 결정적인 체크메이트를 성공시킵니다. 흑 킹이 탈출할 곳이 없으며, 백의 기물로 막을 수도 없습니다. [The emphasized portion `비숍(Ba5)` is surrounded by unrendered Markdown emphasis marks `**`.]

We're an AI first company. Our mission is to streamline your experience and let you turn ideas into execution at the speed of thought. To join our team make PDFs out of your cv and cover letter, upload the PDFs, then retype all the text in the PDFs into more textboxes. Copy/paste is disabled. A question that is illegal to ask is mandatory. <>[]{},%#$ characters forbidden. Accents forbidden. The back button breaks everything. You have been logged out for inactivity. Click here to restart.


My favorite thing on Bluesky is a labeler made by the Social Technologies Lab at Cornell Tech.

bsky.app/profile/did:plc:oubsy

It surfaces metadata on posts. Most useful to me is "This person posted more than 50 times yesterday."

It recasts a post like the one pictured. It reminds me to look at their profile and figure out if they are a real person, and are they terminally online, or engagement farming, or what. Then I might block/mute because I don't need people like that in my lifeworld.

Screenshot of post on Bluesky.

User: Brendel, @brendelbored.bsky.social

Badge with the labeler logo: Posts a lot, more than 50 times yesterday

Who is your favorite shitty athlete? Someone who just wasn’t as good as everybody else bless them but you have a lot of affection for

7 reposts, 177 quotes, 174 likes, 4 saves

Chicken breast tenderloins, chicken wings, drumsticks, and so on.

Have you ever counted how many dead animals, at minimum, you are eating right there?

It is incredibly grounding to realize that, for the 10 chicken wings on your plate, simple arithmetic says at least 5 and maybe even 10 chickens were killed.

Uncomfortable? Yes, it should be. Not because I am saying it, though, but because you are suppressing what you are doing.

Is it really worth it? Just because it tastes good?

