Issues with openid flow for dashboards

**On behalf of a Slack user**

‘I might have a strange bug in the openid flow for dashboards. One of the guys in my team can’t use dashboards in 1 of our environments: 503. Strange thing is, we have 6 environments based on the exact same deployment and openid settings (confirmed by our IAM team), he can access all of them but one. Everyone else (6+ users) can access it fine. It’s been in several versions.
I’m collecting info and logs, is there someone I could exchange thoughts with once I have some more info?’

Hi Sander,

Please share your logs and configuration files and we will try to assist you.

Best,
Mantas

Hey guys, question: I have one user who can’t access one Dashboards instance. Strange thing is, I have 6 users accessing 6 different clusters (AzureAD - OpenID, same config for all). All is well, even for this user: he can log in on 5 of the 6 clusters, but not on the remaining one. Dashboards 2.9.0 throws this error, but previous versions have done the same. The user has tried multiple systems and private windows. I’m a bit flabbergasted, any ideas?

{
  "type": "log",
  "@timestamp": "2023-09-22T11:32:38Z",
  "tags": ["error", "http"],
  "pid": 1,
  "message": "ResponseError: Invalid index name [_iMRuKK5GNFqKeGB5JnLk6FmSAHSXVvFdSn40Bq3tJs], must not start with '_'.: invalid_index_name_exception: [invalid_index_name_exception] Reason: Invalid index name [_iMRuKK5GNFqKeGB5JnLk6FmSAHSXVvFdSn40Bq3tJs], must not start with '_'.\n    at onBody (/usr/share/opensearch-dashboards/node_modules/@opensearch-project/opensearch/lib/Transport.js:374:23)\n    at IncomingMessage.onEnd (/usr/share/opensearch-dashboards/node_modules/@opensearch-project/opensearch/lib/Transport.js:293:11)\n    at IncomingMessage.emit (node:events:525:35)\n    at IncomingMessage.emit (node:domain:489:12)\n    at endReadableNT (node:internal/streams/readable:1358:12)\n    at processTicksAndRejections (node:internal/process/task_queues:83:21) {\n  meta: {\n    body: { error: [Object], status: 400 },\n    statusCode: 400,\n    headers: {\n      'x-opaque-id': '5f6ab87f-9fa1-44c6-8f75-76d9a8053138',\n      'content-type': 'application/json;charset=UTF-8',\n      'content-length': '469'\n    },\n    meta: {\n      context: null,\n      request: [Object],\n      name: 'opensearch-js',\n      connection: [Object],\n      attempts: 0,\n      aborted: false\n    }\n  },\n  isBoom: true,\n  isServer: false,\n  data: null,\n  output: {\n    statusCode: 400,\n    payload: {\n      message: \"Invalid index name [_iMRuKK5GNFqKeGB5JnLk6FmSAHSXVvFdSn40Bq3tJs], must not start with '_'.: invalid_index_name_exception: [invalid_index_name_exception] Reason: Invalid index name [_iMRuKK5GNFqKeGB5JnLk6FmSAHSXVvFdSn40Bq3tJs], must not start with '_'.\",\n      statusCode: 400,\n      error: 'Bad Request'\n    },\n    headers: {}\n  },\n  [Symbol(SavedObjectsClientErrorCode)]: 'SavedObjectsClient/badRequest'\n}"
}

I tried to export all saved objects since it seems related, but I can’t find anything that points to the strange index name.
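For anyone repeating this check, here is a minimal Python sketch of the search described above: it pulls the rejected index name out of a Dashboards JSON log line and looks for it in a saved objects export. It assumes the export is NDJSON (one JSON object per line); the sample export contents and the function names are hypothetical and only illustrate the idea.

```python
import json
import re

# Matches the name inside "Invalid index name [<name>]", with or without a space.
INDEX_RE = re.compile(r"Invalid index name ?\[([^\]]+)\]")


def invalid_index_names(log_line: str) -> set[str]:
    """Return the distinct index names reported in one JSON log line."""
    entry = json.loads(log_line)
    return set(INDEX_RE.findall(entry.get("message", "")))


def objects_mentioning(ndjson_text: str, needle: str) -> list[str]:
    """Return the ids of exported saved objects whose raw JSON contains `needle`."""
    hits = []
    for raw in ndjson_text.splitlines():
        if raw.strip() and needle in raw:
            hits.append(json.loads(raw).get("id", "<no id>"))
    return hits


if __name__ == "__main__":
    # Shortened version of the log entry from this thread.
    sample_log = json.dumps({
        "type": "log",
        "tags": ["error", "http"],
        "message": "ResponseError: Invalid index name "
                   "[_iMRuKK5GNFqKeGB5JnLk6FmSAHSXVvFdSn40Bq3tJs], must not start with '_'.",
    })
    # Hypothetical two-object export, standing in for a real export file.
    sample_export = "\n".join([
        json.dumps({"id": "dash-1", "type": "dashboard", "attributes": {"title": "Overview"}}),
        json.dumps({"id": "pattern-1", "type": "index-pattern", "attributes": {"title": "logs-*"}}),
    ])
    for name in invalid_index_names(sample_log):
        print(name, objects_mentioning(sample_export, name))
```

An empty result only means the name does not appear in the exported objects; it says nothing about where the name actually comes from.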

It’s also not in the user’s bookmark; he is just trying to reach the Dashboards application and gets a 500 in return:

[image]

@neographikal Does the reported user use the same AzureAD account for all environments or separate?

Same account, same config/roles yup

@neographikal Do you access all of the environments the same way? Do you use any reverse proxy?
How did you deploy all your environments?

It’s a Kubernetes setup, so there’s Nginx in front of it. But basically I’ve got 1 Terraform module that has config variables. All 6 clusters use this same Terraform module, just with different variables for the namespace, number of nodes, etc. There is no difference between the clusters as far as I can see.

The same goes for the AzureAD configs - we validated the settings are the same.
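A comparison like this can also be done mechanically. The sketch below diffs two flat `key: value` settings files such as `opensearch_dashboards.yml`. It is only an illustration: the parser is deliberately naive (nested YAML is not handled, indented keys are treated like top-level ones), and the URLs and client ID in the sample are made up.

```python
def parse_flat_yaml(text: str) -> dict[str, str]:
    """Parse simple 'key: value' lines, skipping blank lines and comments.

    Nested YAML is not handled; indented keys are treated like top-level ones.
    """
    settings = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            continue
        # Split on the first colon only, so URL values stay intact.
        key, _, value = stripped.partition(":")
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def diff_settings(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    # Hypothetical settings for a working and a failing environment.
    working = parse_flat_yaml("""
opensearch_security.auth.type: "openid"
opensearch_security.openid.connect_url: "https://login.example.invalid/tenant/.well-known/openid-configuration"
opensearch_security.openid.client_id: "client-a"
""")
    failing = parse_flat_yaml("""
opensearch_security.auth.type: "openid"
opensearch_security.openid.connect_url: "https://login.example.invalid/tenant/.well-known/openid-configuration"
opensearch_security.openid.client_id: "client-b"
""")
    for key, (left, right) in diff_settings(working, failing).items():
        print(f"{key}: {left!r} != {right!r}")
```

Values that are expected to differ per environment, such as client IDs or redirect URLs, will show up in the diff and need to be judged by hand.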

That’s why I’m puzzled.

@neographikal Could you describe the login flow for that reported user?

I understand that the user gets redirected to the Azure login page and then redirected back to OpenSearch Dashboards. Then error 503 is displayed in the WebUI. Is that correct?

Do you have any other errors in the OpenSearch or OpenSearch Dashboards logs preceding the reported error?

I don’t think that the security plugin can pass any index name in the login process. Also, the Azure login process only generates a token that is used to authenticate and authorize the user in OpenSearch.
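Since the token is the only thing the login hands over, it may help to compare its claims between a working and the failing environment. The sketch below decodes a JWT payload without verifying the signature and lists string claims that start with `_`. Whether any claim is related to the index name in the error is not established in this thread, so treat the underscore check purely as a way to narrow things down. `make_unsigned_token` and the sample claims are hypothetical and exist only to make the example runnable.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the claims of a JWT without verifying its signature (inspection only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def claims_starting_with_underscore(claims: dict) -> dict:
    """Return the string-valued claims whose value starts with '_'."""
    return {k: v for k, v in claims.items() if isinstance(v, str) and v.startswith("_")}


def make_unsigned_token(claims: dict) -> str:
    """Build a fake, unsigned token for demonstration only."""
    def enc(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{enc({'alg': 'none'})}.{enc(claims)}.signature"


if __name__ == "__main__":
    # Made-up claims; a real token would come from the failing login.
    token = make_unsigned_token({"sub": "_example-subject", "name": "Example User"})
    claims = decode_jwt_payload(token)
    print(claims_starting_with_underscore(claims))
```

Real tokens contain personal data, so inspect them locally rather than pasting them into online decoders or into a public thread.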

Regarding the Kubernetes deployment, is that on-premises or Azure k8s?

I will take a screengrab this week when I can, but yeah, you are correct. It’s an on-premises deployment of Kubernetes with our own Terraform config for the clusters. No strange errors before the user logs in.

@neographikal Could you also share your config.yml (with openID configuration) and opensearch_dashboards.yml content?

Is there any chance to see the related Nginx config?

Sorry for my slow follow-up; it’s been a busy few weeks, and I’ve been helping out with testing some other parts of OpenSearch. Hopefully I’ll meet the dev in the office this week.